EVALUATION


This effort is a part of Research Area 7, Electronics, Sub-Area 3
Communications. The objective is to define and assess adaptive nulling
algorithms compatible with adaptive antennas with large numbers of antenna
elements or weights. This research work supports RADC TPO 4A, C³
Survivability, Thrust - Communications ECCM. The overall objective is to advance
the state-of-the-art in adaptive array antennas to provide an Electronic
Counter Countermeasure (ECCM) capability for Air Force Communications Systems.

In  this  research  effort,  a  new  adaptive  nulling  algorithm  was  formulated 
and  modeled  by  computer  simulation.  The  algorithm  is  based  on  a  newly 
developed  generalized  performance  function  that  allows  specification  of  the 
directivity  pattern  of  the  antenna.  The  new  performance  function  permits 
constraints  on  the  antenna  gain  in  desired  directions  while  minimizing 
interfering signals. The strength of each constraint can be varied so that
deviations from it can be controlled; i.e., constraints in important
directions can be stiffened so that deviations from them remain small, while
constraints in less important directions can be made softer so that larger
variations are permitted. This algorithm is considered to be a significant advance over the
conventional  least  mean  square  error  (LMS)  algorithm,  allowing  use  of  excess 
degrees  of  freedom  to  specify  the  antenna  pattern.  It  also  will  allow  use  of 
direction-of-arrival  (DOA)  desired  signal  discriminants  in  adaptive  arrays  in 
applications  where  DOA  information  is  not  sufficiently  accurate  for 
conventional  algorithms.  The  next  step  should  be  to  investigate  hardware 
implementation  of  the  algorithm  for  specific  communications  applications. 

JOHN A. GRANIERO
Project  Engineer 




I. INTRODUCTION

In the field of linear estimation, a common goal for the optimum
filter  is  to  minimize  the  mean  square  error.  The  Widrow-Hoff  Least 
Mean  Square  (LMS)  algorithm  [1,2]  is  a  well  known  algorithm  used  with 
adaptive  filters  to  approach  the  optimum  filter.  Subsequent  researchers 
have proposed various modifications to the LMS algorithm. These
modifications have been introduced when the adaptive filter does not
perform  satisfactorily  with  respect  to  other  criteria  in  which  the 
researcher  is  interested.  Once  modifications  are  introduced  to  the 
algorithm,  the  adaptive  filter  is  no  longer  trying  to  minimize  the 
mean  square  error,  but  is  instead  optimizing  some  other  (often  unstated) 
performance  function. 

This paper proposes the explicit addition of terms to the performance
function reflecting the designer's additional criteria. A
specific  modification  is  studied:  the  addition  of  "soft"  constraints. 
With  a  soft  constraint,  some  constraint  error  results  because  the 
weights  do  not  exactly  solve  a  specified  set  of  linear  equalities. 

The  optimum  filter  tries  to  minimize  this  constraint  error  simultaneously 
with  the  error  incurred  by  not  performing  perfect  least  mean  square 
estimation.  The  "soft-constraint  LMS  algorithm",  closely  related  to 
the  LMS  algorithm,  is  derived  which  causes  the  adaptive  transversal 
filter  to  approach  the  optimum  filter.  Convergence  properties  of  this 
algorithm are studied. A relation with unexpected properties between
the  output  power  of  a  signal  from  the  optimum  filter  and  the  signal's 
input  power  is  derived.  An  application  in  the  area  of  adaptive  antenna 
arrays is presented as an example of a use of the proposed performance
function and the corresponding adaptation algorithm. The relationship
between  the  soft-constraint  LMS  algorithm  and  other  versions  of  the  LMS 
algorithm  is  discussed. 


II. PREVIOUS WORK IN MODIFIED LMS ALGORITHMS

Adaptive  filters  using  the  LMS  algorithm  have  been  proposed  for  many 
applications  [3-7].  However,  in  some  situations  it  has  been  necessary  or 
desirable to modify the algorithm [8-12]. Frost [9] proposed forcing the
weights to exactly satisfy a set of linear equalities, which are called
here  a  set  of  "hard  constraints."  This  modification  of  the  LMS  algorithm 
has  been  applied  to  adaptive  antenna  arrays,  to  force  the  gain  of  the 
array  to  be  exactly  unity  in  a  specified  direction,  while  attenuating 
signals  arriving  from  other  directions. 

Another  modification  of  the  LMS  algorithm  is  the  "leaky"  LMS 
algorithm.  This  algorithm  has  a  leak  factor,  so  that  in  the  absence  of 
inputs the weights decay to zero. This form has been proposed independently
by several researchers [11-14]. Using the property that the leak
is  equivalent  to  introducing  a  white  noise  in  the  input  of  the  filter, 
Treichler  [11]  proposed  using  the  algorithm  to  modify  the  characteristics 
of an adaptive line enhancer in a desirable manner. Ahmed et al. [12]
used  the  leak  effect  to  reduce  numerical  instabilities  occurring  in  their 
application.  White  [13]  showed  that  the  leak  could  reduce  inaccuracies 
caused  by  imperfect  hardware  multipliers. 

Zahm  [14]  used  the  leaky  LMS  algorithm  with  adaptive  antenna  arrays 
to  suppress  strong  "jammers"  in  the  presence  of  weaker  signals.  However, 
using the leaky LMS algorithm alone resulted in the undesirable characteristic
that the array rejected all signals (and jammers) after a period of
time. To counteract this effect, Zahm introduced a set of "steering"
weights into the algorithm, so that the weights of the adaptive array
converge to the steering weights in the absence of any jammers or signals.
These steering weights prevent the adaptive antenna array from turning
itself off. Also, the steering weights Zahm chose as an example introduced
desirable effects in the directivity pattern of the array.

Extending  Zahm's  work  and  the  work  on  the  leaky  LMS  algorithm 
results in the modification to the LMS algorithm discussed here.


III. DEFINITIONS AND TERMINOLOGY FOR THE ADAPTIVE TRANSVERSAL FILTER

Although  applicable  to  any  linear  combiner,  this  work  assumes  for 
ease  of  discussion  that  the  performance  function  and  the  soft-constraint 
LMS  algorithm  developed  later  are  used  with  an  adaptive  transversal 
filter,  as  illustrated  in  Figure  3-1.  Definitions  and  terminology  for 
the  adaptive  filter  follow. 

A sampled time sequence u(j) is the input to an n-1 delay transversal
filter, where j is the time index of samples taken. The n weights
$w_0, w_1, \ldots, w_{n-1}$ can be adjusted by the adaptation algorithm as time
progresses. The filter output y(j) is compared against a time sequence
d(j), which is called the desired signal. (The source and nature of the
desired signal vary with the application.) The purpose of the filter
is to provide an estimate y(j) of the desired signal d(j). The difference
between d(j) and y(j) is called the error signal e(j).

The  input  sequence  u(j)  may  contain  one  or  all  of  three  types  of 
signals.  A  signal  may  be  noise;  it  may  be  a  deliberately  produced 
sequence  but  of  no  use  in  forming  the  estimate  (an  interferer  or  jammer); 
or  it  may  be  a  sequence  relevant  to  estimating  d(j). 

The  values  at  the  taps  of  the  transversal  filter  at  time  j  are 
denoted  by  the  data  vector  X(j): 

$$X(j) = [\,u(j) \;\; u(j-1) \;\; \cdots \;\; u(j-n+1)\,]^T . \tag{3-1}$$

The  set  of  n  weights  is  written  in  vector  form  as: 

$$W = [\,w_0 \;\; w_1 \;\; \cdots \;\; w_{n-1}\,]^T . \tag{3-2}$$

Then the filter output y(j), the estimate of d(j), is expressed as
the  inner  product  of  two  vectors: 

$$y(j) = X^T(j) W = W^T X(j) . \tag{3-3}$$

The  error  signal  is  simply 

$$e(j) = d(j) - y(j) . \tag{3-4}$$
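
As an illustration of definitions (3-1) through (3-4), the following Python sketch forms the data vector, the filter output, and the error signal for one time index. The function names `data_vector` and `filter_step`, the zero-fill of samples before the start of the record, and the numerical values are assumptions made for this example; they do not come from the report.

```python
def data_vector(u, j, n):
    """Build X(j) = [u(j), u(j-1), ..., u(j-n+1)] as in (3-1).

    Assumption for this sketch: samples before index 0 are taken as zero.
    """
    return [u[j - k] if j - k >= 0 else 0.0 for k in range(n)]


def filter_step(X, W, d):
    """Return (y, e) for one time index j.

    X: data vector X(j) of (3-1)
    W: weight vector [w_0, ..., w_{n-1}] of (3-2)
    d: desired-signal sample d(j)
    """
    if len(X) != len(W):
        raise ValueError("X(j) and W must both have n components")
    y = sum(w * x for w, x in zip(W, X))  # inner product (3-3)
    e = d - y                             # error signal (3-4)
    return y, e


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
u = [1.0, 2.0, 3.0, 4.0]
W = [0.5, 0.25, 0.0]
X = data_vector(u, 3, 3)        # [u(3), u(2), u(1)]
y, e = filter_step(X, W, 3.0)   # y = 0.5*4 + 0.25*3, e = 3 - y
```

With these values the data vector is [4.0, 3.0, 2.0], the output is 2.75, and the error is 0.25.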

IV. THE PERFORMANCE FUNCTION


Any  function  measuring  the  performance  of  an  adaptive  filter  must 
reflect the concerns of the filter designer. The basic consideration
is often simply that estimation is being performed. The performance of
an  adaptive  filter  in  estimating  an  unknown  signal  is  often  measured 
by  the  mean  square  error,  a  widely  used  criterion  [15-17]. 

But  the  designer  may  recognize  additional  considerations  in  some 
applications.  Using  an  adaptive  antenna  array  as  an  example,  it  is 
sometimes  desirable  to  specify  the  array  gain  in  a  particular  direction 
provided  this  requirement  does  not  increase  the  estimation  error 
(mean  square  error)  excessively.  But  if  the  estimation  error  does 
increase  too  much,  it  may  be  possible  to  decrease  it  significantly  while 
still staying close to (though not exactly meeting) the gain specification.

The  array  example  shows  that  the  performance  function  must  do  two 
things.  First,  it  must  measure  the  extent  to  which  the  array  gain 
specification  is  violated  by  the  chosen  filter,  as  well  as  measure  the 
estimation  error.  Second,  the  performance  function  must  weight  the 
relative contributions of the estimation error and of the gain specification
error, so that a balance may be struck between the two sources
of  error. 

A performance function p(j) satisfying these two considerations is

$$p(j) = E\{e^2(j)\} + \sum_{i=1}^{m} b_i e_{ci}^2 , \tag{4-1}$$

where E{a} denotes the expected value of a. The first term, $E\{e^2(j)\}$,
is the mean square error (the estimation error); the second term is the
weighted contribution of the specification ("constraint") errors.
Definition (4-1) assumes that the designer specifies m constraints, and
that the error resulting from not exactly meeting constraint i is $e_{ci}$.
The designer selects the non-negative constant $b_i$ to specify the
relative importance of the ith constraint error compared to the estimation
error. The greater $b_i$ is, the more the error $e_{ci}$ affects the
value of the performance function.

The relation between the constraint errors $e_{ci}$ and the adaptive
filter's weights is still unspecified. The filter designer is free to
choose  any  function.  Different  selections  will  produce  different 
adaptation  algorithms.  The  form  for  constraint  error  studied  in 
this  paper  is  a  linear  function  of  the  weight  vector. 

As  an  example,  return  to  the  adaptive  antenna  array.  Let  the  ith 
constraint  specify  the  desired  gain  in  a  particular  direction  at  a 
specified  frequency.  The  actual  gain  of  the  array  in  this  direction 
(at  the  specified  frequency)  is  calculated  by  a  linear  expression: 

$$\text{gain} = A_i^T W , \tag{4-2}$$

where $A_i$ is a constant vector with n components. (Section X contains
details for constructing $A_i$.) Thus, if the desired gain is the scalar
$h_i$, the constraint error is:

$$e_{ci} = A_i^T W - h_i . \tag{4-3}$$

Using  this  form  for  the  constraint  error  in  (4-1)  results  in  the 
performance  function  studied  here: 


$$p(j) = E\{e^2(j)\} + \sum_{i=1}^{m} b_i \left(A_i^T W - h_i\right)^2 . \tag{4-4}$$

This  performance  function  is  written  in  matrix  form  as: 

$$p(j) = E\{e^2(j)\} + (AW - H)^T B (AW - H) , \tag{4-5}$$

where A is the m x n matrix composed of the vectors $A_i$:

$$A = [\,A_1 \;\; A_2 \;\; \cdots \;\; A_i \;\; \cdots \;\; A_m\,]^T ; \tag{4-6}$$

B is the m x m diagonal matrix with diagonal elements $b_i$:

$$B = \mathrm{diag}[\,b_1, b_2, \ldots, b_i, \ldots, b_m\,] ; \tag{4-7}$$

and H is an m-dimensional vector composed of the individual desired
constraint values $h_i$:

$$H = [\,h_1 \;\; h_2 \;\; \cdots \;\; h_i \;\; \cdots \;\; h_m\,]^T . \tag{4-8}$$

This  performance  function  (4-5)  will  be  called  a  "soft-constraint  least 
mean  square  error  performance  criterion."  The  constraints  are  called 
soft  because,  unlike  constraints  in  most  optimization  problems,  they 
can  be  violated  (not  satisfied  exactly). 

The  goal  of  the  adaptation  algorithm  which  is  developed  in  section 
VIII  is  to  find  the  weight  vector  that  minimizes  the  performance 
function  p(j)  in  (4-5). 
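
To make the role of the constants $b_i$ concrete, the following Python sketch evaluates the constraint errors of (4-3) and the weighted constraint term of (4-4) for a two-weight filter. The function names, the two example constraints, and all numbers are assumptions made for illustration; the mean square error is passed in as a given number rather than estimated from data.

```python
def constraint_errors(A, W, H):
    """e_ci = A_i^T W - h_i for each row A_i of A, as in (4-3)."""
    return [sum(a * w for a, w in zip(row, W)) - h for row, h in zip(A, H)]


def soft_constraint_penalty(A, b, W, H):
    """Sum of b_i * e_ci^2, the second term of (4-4) and (4-5)."""
    return sum(bi * ec * ec for bi, ec in zip(b, constraint_errors(A, W, H)))


def performance(mse, A, b, W, H):
    """p = mean square error + weighted constraint error, as in (4-5)."""
    return mse + soft_constraint_penalty(A, b, W, H)


# Hypothetical constraints on a 2-weight filter.
A = [[1.0, 1.0],     # constraint 1: w_0 + w_1 should equal 1
     [1.0, -1.0]]    # constraint 2: w_0 - w_1 should equal 0
H = [1.0, 0.0]
W = [0.75, 0.25]     # meets constraint 1 exactly, misses constraint 2 by 0.5

stiff = soft_constraint_penalty(A, [10.0, 10.0], W, H)  # both constraints stiff
soft = soft_constraint_penalty(A, [10.0, 0.1], W, H)    # constraint 2 softened
p_stiff = performance(0.1, A, [10.0, 10.0], W, H)       # assumed MSE of 0.1
```

The same violation of constraint 2 costs 2.5 when that constraint is stiff and 0.025 when it is soft, which is the sense in which the designer trades constraint error against estimation error.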

The  dependence  of  p(j)  on  the  weight  vector  W  is  important.  The 
absence  of  non-global  minima  is  desired,  since  this  absence  helps 
prevent  an  adaptation  algorithm  from  settling  to  an  incorrect  weight 
vector (i.e., finding a local optimum). The dependence of p(j) on W is
obtained by expanding the mean square error term in (4-5) and using
(3-3):

$$\begin{aligned}
E\{e^2(j)\} &= E\{[d(j) - y(j)]^2\} \\
&= E\{d^2(j) - 2 d(j) X^T(j) W + W^T X(j) X^T(j) W\} \\
&= E\{d^2(j)\} - 2 P^T(j) W + W^T R(j) W ,
\end{aligned} \tag{4-9}$$

where  the  cross  correlation  between  the  data  vector  X(j)  and  the 
desired  signal  d(j)  is  denoted  by  P(j): 

$$P(j) = E\{d(j) X(j)\} ; \tag{4-10}$$

and R(j) denotes the autocorrelation matrix of the data vector:

$$R(j) = E\{X(j) X^T(j)\} . \tag{4-11}$$

Substituting  (4-9)  into  (4-5)  expresses  the  performance  function 
directly  in  terms  of  the  weight  vector  W: 

$$p(j) = E\{d^2(j)\} - 2 P^T(j) W + W^T R(j) W + (AW - H)^T B (AW - H) . \tag{4-12}$$

Clearly,  p(j)  is  a  quadratic  function  of  the  weights.  Because  it  is  a 
sum  of  squared  quantities,  it  cannot  be  negative.  Thus  one  of  two 
situations  exists.  The  first  possibility  is  that  there  is  exactly  one 
minimum  to  the  performance  function,  and  only  one  weight  vector  achieves 
this  minimum.  This  situation  may  be  visualized  as  a  parabolic  bowl  in  a 
hyperspace of dimension n. The second possibility is that the performance
function attains the same minimum value for a whole set of weight
vectors. In this case the set of weight vectors forms a connected
space so that all minima of the performance function are adjacent to one
another; there are no isolated minima. This situation may be visualized
in  an  n-dimensional  hyperspace  as  a  trough,  equally  deep  at  all  points 
along  the  bottom  of  the  trough,  and  with  parabolic  sides  to  the  trough. 

This paper's primary interest is in the first case, where the
weight  vector  yielding  the  optimum  (minimum)  value  of  the  performance 
function  is  unique.  The  analysis  can,  if  desired,  be  extended  to  the 
second  situation,  by  essentially  considering  a  smaller  hyperspace  which 
contains a unique minimum.


V. THE OPTIMUM FILTER

This  section  derives  an  expression  for  the  optimum  weight  vector, 
defined  here  as  the  unique  weight  vector  specifying  the  filter  which  has 
the  optimum  (minimum)  performance  p(j).  The  condition  under  which  the 
minimum  p(j)  occurs  with  a  non-unique  weight  vector  is  also  determined. 

Any  weight  vector  W  minimizing  the  performance  function  p(j)  forces 
the  gradient  of  p(j)  to  zero.  From  (4-5),  the  overall  gradient  of  p(j) 
with  respect  to  W  is: 

$$\nabla_W\, p(j) = \nabla_W E\{e^2(j)\} + \nabla_W [(AW - H)^T B (AW - H)] . \tag{5-1}$$

Analyzing  the  first  term  by  taking  the  gradient  of  (4-9)  yields: 

$$\nabla_W E\{e^2(j)\} = -2 [P(j) - R(j) W]^T . \tag{5-2}$$

This  first  term  of  the  overall  gradient  comes  from  the  mean  square  error 
(estimation  error)  term  of  the  performance  criterion.  This  is  the  same 
gradient used to develop the LMS algorithm.

Analyzing  the  second  term  of  (5-1)  yields: 

$$\nabla_W [(AW - H)^T B (AW - H)] = 2 [A^T B (AW - H)]^T . \tag{5-3}$$

This  second  term  is  due  entirely  to  the  soft  constraints  imposed  by  the 
filter  designer. 

From  (5-1)  the  overall  gradient  is  the  sum  of  (5-2)  and  (5-3): 

$$\nabla_W\, p(j) = -2 [P(j) - R(j) W]^T + 2 [A^T B (AW - H)]^T . \tag{5-4}$$

The optimum value for W occurs when the gradient (5-4) is set equal
to zero, yielding:

$$[R(j) + A^T B A]\, W = P(j) + A^T B H . \tag{5-5}$$

Thus  the  necessary  condition  for  the  optimum  (minimum)  performance  to 
occur at a unique weight vector is that the matrix $R(j) + A^T B A$ be
nonsingular. Under this condition, the unique optimum weight vector, denoted
$W_{opt}(j)$, is:

$$W_{opt}(j) = [R(j) + A^T B A]^{-1} [P(j) + A^T B H] . \tag{5-6}$$

Note that it is not necessary for either R(j) or $A^T B A$ to be nonsingular.
In  fact,  one  use  for  the  soft-constraint  LMS  algorithm  arises  when  the 
data  vector  autocorrelation  matrix  R(j)  is  indeed  singular  (or  possibly 
just  ill-conditioned).  In  such  a  case  a  set  of  soft  constraints  can  be 
generated  to  yield  a  unique  optimum  weight  vector,  as  has  been  done  with 
the  leaky  LMS  algorithm  [12]. 
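
The point about a singular R(j) can be checked on a small case. The Python sketch below, restricted to two weights, builds $R + A^T B A$ and $P + A^T B H$ and solves (5-5) by Cramer's rule. The matrices R, P, A, B, and H are hypothetical values chosen so that R alone is singular; the helper names and the use of Cramer's rule are choices made for this illustration, not the report's method.

```python
def at_b_a(A, b):
    """Return A^T B A for B = diag(b); A is a list of m rows of length n."""
    n = len(A[0])
    return [[sum(bi * row[r] * row[c] for bi, row in zip(b, A))
             for c in range(n)] for r in range(n)]


def at_b_h(A, b, H):
    """Return the vector A^T B H."""
    n = len(A[0])
    return [sum(bi * row[r] * h for bi, row, h in zip(b, A, H))
            for r in range(n)]


def det2(M):
    """Determinant of a 2x2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def solve2(M, v):
    """Solve the 2x2 system M w = v by Cramer's rule."""
    d = det2(M)
    if abs(d) < 1e-12:
        raise ValueError("matrix is singular; no unique optimum")
    return [(v[0] * M[1][1] - M[0][1] * v[1]) / d,
            (M[0][0] * v[1] - M[1][0] * v[0]) / d]


R = [[1.0, 1.0], [1.0, 1.0]]   # assumed singular autocorrelation matrix
P = [1.0, 1.0]                 # assumed cross-correlation vector
A = [[1.0, -1.0]]              # one soft constraint: w_0 - w_1 = 0
b = [1.0]
H = [0.0]

C = at_b_a(A, b)
M = [[R[r][c] + C[r][c] for c in range(2)] for r in range(2)]
rhs = [p + q for p, q in zip(P, at_b_h(A, b, H))]
W_opt = solve2(M, rhs)         # (5-6) for this 2-weight example
```

Here R has determinant zero, yet the combined matrix is twice the identity, so the optimum is the single weight vector [0.5, 0.5].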


VI. AN ASSUMPTION OF STATIONARITY


The remainder of this work assumes that the signals d(j) and u(j)
are generated by stationary stochastic processes. Thus, the statistics
P(j) and R(j), the performance criterion p(j), and the optimum weight
vector $W_{opt}(j)$ are constant, and are now denoted by P, R, p, and $W_{opt}$
respectively, dropping the time index j.

Using this assumption of stationarity, the performance function
p is now written from (4-5) as:

$$p = E\{e^2(j)\} + (AW - H)^T B (AW - H) , \tag{6-1}$$

the gradient of p from (5-4) is:

$$\nabla_W\, p = -2 [P - RW]^T + 2 [A^T B (AW - H)]^T , \tag{6-2}$$

and the optimum weight vector $W_{opt}$ is written from (5-6) as:

$$W_{opt} = (R + A^T B A)^{-1} (P + A^T B H) . \tag{6-3}$$


VII. DETERMINATION OF THE OPTIMUM WEIGHT VECTOR BY GRADIENT SEARCH


Calculating the optimum weight vector using (6-3) is not always feasible,
even when all quantities are known. This may be due to the size of
the filter, or to numerical difficulties caused by properties of the
matrices. Thus alternative approaches have been devised. A common technique
is  to  make  successive  approximations  to  the  optimum  weight  vector.  Given 
one  estimate  of  the  weight  vector,  denoted  by  W(j),  the  next  estimate, 
W(j+1),  is  iteratively  generated  from  W(j),  governed  by  how  well  W(j) 
satisfies  (5-5).  The  time  index  j  denotes  sequential  estimates  since  it 
is  assumed  here  that  one  update  occurs  at  each  time  instant. 

The  technique  of  successive  approximation  used  in  this  research  is 
called  gradient  search  [18].  The  gradient  of  the  performance  surface 
is calculated for the current value of the weight vector, W(j). The gradient
specifies the direction of weight vector change which will increase
the  performance  function  p  most  rapidly;  but  since  the  goal  is  to  reduce 
the  performance  function,  the  next  estimate  of  the  optimum  weight  vector 
is  obtained  by  moving  from  the  current  estimate  in  the  direction  opposite 
to  that  of  the  gradient,  through  a  distance  proportional  to  the  magnitude 
of  the  gradient: 

$$W(j+1) = W(j) - \mu \nabla_W^T\, p , \tag{7-1}$$

where $\mu$ is a positive constant chosen by the filter designer.

Using (6-2) for $\nabla_W\, p$ in (7-1) gives the update equation:

$$W(j+1) = W(j) + 2\mu [P - R W(j)] - 2\mu A^T B [A W(j) - H] . \tag{7-2}$$

Repeatedly using this update equation causes the estimate of the optimum
weight vector to approach the actual optimum $W_{opt}$ of (6-3), provided $\mu$ is
small enough (see section IX).
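
The iteration (7-2) can be sketched in a few lines of Python. The values of R, P, A, B, H, and the step size below are hypothetical, and the helper names are introduced only for this example. For these values the eigenvalues of $R + A^T B A$ are 0.5 and 5.5, so the assumed step size of 0.05 lies inside the range for which the iteration settles.

```python
def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def gradient_step(W, R, P, A, b, H, mu):
    """One update of (7-2): W + 2*mu*(P - R W) - 2*mu*A^T B (A W - H)."""
    RW = matvec(R, W)
    # B (A W - H): weighted constraint errors, one entry per constraint
    resid = [bi * (aw - h) for bi, aw, h in zip(b, matvec(A, W), H)]
    n = len(W)
    # k-th entry of A^T B (A W - H)
    atb = [sum(A[i][k] * resid[i] for i in range(len(A))) for k in range(n)]
    return [W[k] + 2 * mu * (P[k] - RW[k]) - 2 * mu * atb[k] for k in range(n)]


# Hypothetical known statistics and one soft constraint (w_0 + w_1 = 1).
R = [[1.0, 0.5], [0.5, 1.0]]
P = [0.5, 0.25]
A = [[1.0, 1.0]]
b = [2.0]
H = [1.0]
mu = 0.05

W = [0.0, 0.0]
for _ in range(2000):
    W = gradient_step(W, R, P, A, b, H, mu)

# Closed-form optimum of (6-3) for this 2x2 case, worked out by hand:
# [[3, 2.5], [2.5, 3]] W = [2.5, 2.25]
W_expected = [1.875 / 2.75, 0.5 / 2.75]
```

After the loop, W agrees with the closed-form solution of (6-3) to within rounding error.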


VIII. THE SOFT-CONSTRAINT LMS ALGORITHM


The algorithm (7-2) for approaching the optimum weight vector $W_{opt}$
is applicable only if all quantities are known. Such knowledge is generally
not available in practice. If the statistics of the input signal
u(j) are unknown, then P and R are unknown. This can occur when a known
signal is subject to additive noise, is passed through a filter whose
characteristics are not perfectly known, or is distorted. Nevertheless,
it is still possible to perform signal estimation subject to soft
constraints. To do this, the update equation (7-2) is modified by replacing
P and R with estimates. The estimates chosen must depend upon the input
signal u(j), so that these estimates are based on data statistics,
rather than on a priori guesses. The estimates chosen are:


$$\hat{P} = d(j) X(j) , \qquad \hat{R} = X(j) X^T(j) . \tag{8-1}$$


It  is  easily  shown  that  these  estimates  are  unbiased: 

$$E\{\hat{P}\} = E\{d(j) X(j)\} = P , \qquad E\{\hat{R}\} = E\{X(j) X^T(j)\} = R . \tag{8-2}$$


Thus  the  gradient  search  algorithm  (7-1)  is  replaced  by: 

$$W(j+1) = W(j) - \mu \hat{\nabla}_W^T\, p , \tag{8-3}$$

where an estimate of the gradient is used in place of the true gradient.
This means that the estimates $\hat{P}$ and $\hat{R}$ replace the true values of P and R
in (6-2), yielding:

$$\hat{\nabla}_W\, p = -2 [\hat{P} - \hat{R} W(j)]^T + 2 \{A^T B [A W(j) - H]\}^T . \tag{8-4}$$

Substituting  (8-1)  into  (8-4)  results  in: 

$$\hat{\nabla}_W\, p = -2 [d(j) X(j) - X(j) X^T(j) W(j)]^T + 2 \{A^T B [A W(j) - H]\}^T . \tag{8-5}$$

Using definitions (3-3) and (3-4) in (8-5) and rearranging yields:

$$\hat{\nabla}_W\, p = -2 e(j) X^T(j) + 2 \{A^T B [A W(j) - H]\}^T . \tag{8-6}$$


Substituting  (8-6)  into  the  update  equation  (8-3)  results  in: 


$$W(j+1) = W(j) + \underbrace{2\mu e(j) X(j)}_{\text{gradient due to estimation error}} - \underbrace{2\mu A^T B [A W(j) - H]}_{\text{gradient due to constraint errors}} , \tag{8-7}$$

or:

$$W(j+1) = (I - 2\mu A^T B A) W(j) + 2\mu e(j) X(j) + 2\mu A^T B H . \tag{8-8}$$


Equations  (8-7)  and  (8-8)  are  alternate  forms  of  what  is  defined  here  as 
the  "soft-constraint  LMS  algorithm." 
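
A minimal Python sketch of the update (8-7) follows. The test scenario is an assumption made for this example only: the desired signal is produced by a hypothetical two-weight system driven by white Gaussian input, with no added noise, and the single soft constraint is chosen to agree with that system so that the optimum weight vector equals the true weights and the outcome does not depend on the particular random sequence. The function name and the step size are likewise illustrative.

```python
import random


def soft_constraint_lms_step(W, X, d, A, b, H, mu):
    """One update of (8-7); returns the new weights and the error e(j)."""
    y = sum(w * x for w, x in zip(W, X))   # filter output (3-3)
    e = d - y                              # error signal (3-4)
    # B (A W - H): weighted constraint errors, one entry per constraint
    ce = [bi * (sum(a * w for a, w in zip(row, W)) - h)
          for bi, row, h in zip(b, A, H)]
    new_W = []
    for k in range(len(W)):
        atb = sum(A[i][k] * ce[i] for i in range(len(A)))  # k-th entry of A^T B (A W - H)
        new_W.append(W[k] + 2 * mu * e * X[k] - 2 * mu * atb)
    return new_W, e


random.seed(0)
true_w = [0.5, -0.3]                    # hypothetical system generating d(j)
A, b, H = [[1.0, 1.0]], [1.0], [0.2]    # soft constraint: w_0 + w_1 = 0.2
mu = 0.02                               # assumed adaptation constant
W = [0.0, 0.0]
X = [0.0, 0.0]                          # tapped delay line, newest sample first
for _ in range(5000):
    X = [random.gauss(0.0, 1.0)] + X[:-1]        # shift in a new input sample
    d = sum(t * x for t, x in zip(true_w, X))    # noise-free desired signal
    W, e = soft_constraint_lms_step(W, X, d, A, b, H, mu)
```

Only the current data vector, the current error, and the designer's constraint matrices enter each update; no estimate of P or R is stored. With an inconsistent constraint or a noisy desired signal the weights would instead fluctuate about the optimum of (6-3).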


IX. STATISTICAL PROPERTIES OF THE SOFT-CONSTRAINT
LMS ALGORITHM WEIGHT VECTOR

Random  Noise  in  the  Weight  Vector 

If  the  steepest  descent  update  algorithm  in  (7-2)  is  used  to  adapt 
the  weight  vector,  the  resulting  weight  vector  sequence  depends  only  on 
P, R, and the weight vector's initial value. In this ideal case, P and
R are known a priori. Hence, a specific sequence is generated, regardless
of which ensemble member of the stochastic processes generating
u(j) and d(j) occurs. This is not true, however, with the soft-constraint
LMS algorithm (8-7) or (8-8). Although the gradient term in
(8-4) due to the soft constraints is calculated perfectly from the
designer's specification of the soft constraints and knowledge of the
current weight vector, the term due to the mean square error is only an
estimate, since $\hat{P}$ and $\hat{R}$ are estimates of P and R. The estimate chosen
results  in  a  random  quantity,  since  it  depends  on  the  actual  sequences 
u(j)  and  d(j).  This  results  in  an  ensemble  of  weight  vector  sequences. 
The  ensemble  can  be  pictured  as  arising  from  a  bank  of  adaptive  filters, 
all  beginning  with  the  same  initial  weight  vector,  but  all  receiving 
different  ensemble  members  for  u(j)  and  d(j).  This  section  discusses 
the statistical properties of this ensemble of weight vectors.

Convergence of the Mean Weight Vector

Theorem 1: Convergence of the Mean Weight Vector
If 1) The soft-constraint LMS algorithm (8-7) or (8-8) produces
      a weight vector sequence W(j) from a data vector sequence
      X(j) and a desired signal sequence d(j), and if

   2) W(j) and X(j) are statistically independent, and if

   3) The matrix $R + A^T B A$ is nonsingular, and if

   4) $0 < \mu < \dfrac{1}{\lambda_{max}}$ ,

      where $\lambda_{max} = \max\{\mathrm{eig}\{R + A^T B A\}\}$,
      eig{Y} is the set of eigenvalues of matrix Y,
      and max{a set} is the maximum value of the set,

Then in the limit the mean weight vector converges to the optimum
weight vector:

$$\lim_{j \to \infty} E\{W(j)\} = W_{opt} , \tag{9-1}$$

where E{W(j)} is the expected value (mean) taken over the ensemble
of weight vectors at time j.

The  proof  of  this  theorem  is  contained  in  Appendix  A. 

The  definition  of  convergence  used  in  Theorem  1  is  weaker  than  that 
used with stochastic approximation methods [19]. The latter require
that  in  addition  to  the  mean  weight  vector  converging  to  the  optimum 
value  as  given  in  (9-1),  the  weight  vector's  covariance  must  go  to  zero; 
meaning  that  every  member  of  the  ensemble  of  weight  vector  sequences 
must  approach  the  optimum.  However,  stochastic  approximation  methods 
suffer  from  the  disadvantage  that  if  the  signal  statistics  vary  slowly 
(are not strictly stationary), the weight vector cannot track the
time-varying optimum value. By contrast, the soft-constraint LMS algorithm
can follow a slowly moving optimum weight vector; the exact characteristics
in this environment are a subject for further study.

The  second  condition  of  Theorem  1,  that  for  convergence  W(j)  and 
X(j)  must  be  statistically  independent,  is  not  met  when  the  soft-constraint 
LMS  algorithm  is  applied  to  a  transversal  filter  (Figure  3-1).  Due 
to the nature of the algorithm, W(j) is a function of all past data
vectors up to X(j-1). And because of the operation of a tapped delay
line, X(j) contains exactly n-1 of the elements of the vector
X(j-1). Thus, since W(j) and X(j) are both functions of X(j-1), they
cannot be statistically independent of each other. However, when the
adaptation constant $\mu$ is small, W(j) depends only weakly on X(j-1);
hence the cross-correlation between W(j) and X(j) is small, yielding a
close  approximation  to  the  assumption  of  independence.  The  effect  of 
violating  this  assumption  has  been  studied  for  the  LMS  algorithm 
[20-22],  with  the  conclusion  that  the  weight  vector  mean  converges  to 
a  value  which  is  biased  away  from  the  optimum  weight  vector;  but  a 
value as close as desired to the optimum can be attained by making $\mu$
small.  It  is  expected  that  the  same  behavior  can  be  proved  for  the 
soft-constraint  LMS  algorithm,  due  to  the  close  similarity  to  the 
standard  LMS  algorithm.  Experience  with  the  algorithm  supports  this 
expectation. 

Note that the maximum value for $\mu$ permitting convergence of the
mean weight vector (condition 4 of Theorem 1) depends on $\lambda_{max}$, which
is unknown when R is unknown a priori. However, an upper bound on
$\lambda_{max}$ which is easy to compute in an actual problem is:

$$\lambda_{max} = \max\{\mathrm{eig}\{R + A^T B A\}\} \le \max\{\mathrm{eig}\{R\}\} + \max\{\mathrm{eig}\{A^T B A\}\} . \tag{9-2}$$

This is true since all eigenvalues of R and $A^T B A$ are non-negative, so
the maximum eigenvalue of $(R + A^T B A)$ cannot be larger than the sum of the
maximum eigenvalues of R and $A^T B A$. Now the maximum eigenvalue of $A^T B A$
is available, since these matrices are predetermined by the filter
designer. And since the trace of an autocorrelation matrix is the sum
of its (non-negative) eigenvalues, the maximum eigenvalue of R must be
less than or equal to the trace of R, which is just n times the input
power $E\{u^2(j)\}$ to the filter. Thus

$$\lambda_{max} \le \mathrm{Tr}[R] + \max\{\mathrm{eig}\{A^T B A\}\} \le n E\{u^2(j)\} + \max\{\mathrm{eig}\{A^T B A\}\} ; \tag{9-3}$$

so that a sufficient condition on $\mu$ to satisfy assumption 4 of Theorem
1 is:

$$0 < \mu < \frac{1}{n E\{u^2(j)\} + \max\{\mathrm{eig}\{A^T B A\}\}} , \tag{9-4}$$

which can be calculated without a priori knowledge of R. It is generally
more restrictive than the bound of Theorem 1.
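
The bound (9-4) lends itself to a short calculation. The Python sketch below makes two simplifications that are assumptions of this example rather than part of the report: the input power is replaced by a sample average over a finite record, and max{eig{$A^T B A$}} is replaced by the trace of $A^T B A$, which equals the sum of $b_i$ times the squared length of $A_i$ and is never smaller than the largest eigenvalue because all eigenvalues are non-negative. The resulting limit is therefore at least as conservative as (9-4). The function name and numbers are hypothetical.

```python
def step_size_bound(u_samples, n, A, b):
    """Conservative upper limit on mu in the spirit of (9-4).

    u_samples: a record of input samples u(j), used to estimate E{u^2(j)}
    n:         number of filter weights
    A, b:      constraint rows A_i and their non-negative weights b_i
    """
    power = sum(x * x for x in u_samples) / len(u_samples)   # sample input power
    # Tr[A^T B A] = sum_i b_i * |A_i|^2, an upper bound on max{eig{A^T B A}}
    trace_atba = sum(bi * sum(a * a for a in row) for bi, row in zip(b, A))
    return 1.0 / (n * power + trace_atba)


# Hypothetical record with unit power and one soft constraint on two weights.
u = [1.0, -1.0, 1.0, -1.0]
A = [[1.0, 1.0]]
b = [1.0]
bound = step_size_bound(u, 2, A, b)   # 1 / (2*1 + 1*2)
```

With a single constraint the trace and the largest eigenvalue of $A^T B A$ coincide, so in this example the limit of 0.25 is exactly the value given by (9-4).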

The  proof  of  Theorem  1  in  Appendix  A  points  out  the  interesting 
fact  that  the  mean  weight  vector  follows  exactly  the  same  trajectory 
that  the  weight  vector  would  follow  if  perfect  gradient  measurements 
were available. Thus the approximation to the gradient (inclusion of
"gradient noise") does not change the convergence rate of the mean
weight vector.


Weight  Vector  Covariance 


The weight vector covariance measures how much individual members
of the ensemble of weight vector sequences vary from the mean weight
vector. Since the mean weight vector converges in the limit to the
optimum weight vector, the greater the weight vector covariance is,
the further individual members of the ensemble of weight vector sequences
are from the optimum in the limit. This variation implies poor
performance.

The weight vector covariance matrix $C_{ww}(j)$ is defined by:

$$\begin{aligned}
C_{ww}(j) &= E\{[W(j) - \bar{W}(j)][W(j) - \bar{W}(j)]^T\} \\
&= E\{W(j) W^T(j)\} - \bar{W}(j) \bar{W}^T(j) ,
\end{aligned} \tag{9-5}$$

where E{W(j)}, the mean weight vector, is written as $\bar{W}(j)$ to simplify
notation.

Theorem 2: Weight Vector Covariance
If 1) Theorem 1 holds, and if

   2) W(j) and d(j) are statistically independent, and if

   3) d(j) and u(j) are Gaussian distributed,

then the recursion equation for the weight vector covariance is:

$$\begin{aligned}
C_{ww}(j+1) = {} & [I - 2\mu (R + A^T B A)]\, C_{ww}(j)\, [I - 2\mu (R + A^T B A)] \\
& + 4\mu^2 \Big\{ R\, C_{ww}(j)\, R + R\, \mathrm{Tr}[C_{ww}(j) R] + R\, E\{e^2(j)\}\big|_{W = \bar{W}(j)} \\
& \qquad + [P - R \bar{W}(j)][P - R \bar{W}(j)]^T \Big\} .
\end{aligned} \tag{9-6}$$

The proof of this theorem is contained in Appendix B.

It is important to know when the weight vector covariance matrix
remains  bounded,  since  on  occasion  the  mean  weight  vector  will  converge 
to  the  optimum  value,  while  the  weight  vector  covariance  matrix  grows 
without  bound.  This  means  that  the  individual  members  of  the  weight 
vector sequence ensemble vary around the proper solution, but the variations
grow larger and larger. Such a situation is undesirable. The
following  analysis  finds  conditions  where  the  weight  vector  covariance 
matrix  is  guaranteed  to  remain  finite,  and  finds  other  conditions  where 
the  weight  vector  covariance  matrix  is  guaranteed  to  grow  without 
bound. The behavior of the weight vector covariance matrix is undetermined
for the remaining cases.

The trace of the weight vector covariance matrix measures its
"magnitude". All off-diagonal terms of a covariance matrix are less
than or equal in magnitude to the largest of the diagonal terms (see
footnote), and the diagonal terms are all positive, so the trace upper
bounds the magnitude of every element of the covariance matrix. Applying
the trace (a linear operator) to each term of (9-6) yields the recursion
equation for the trace of the weight vector covariance matrix:

$$\begin{aligned} \mathrm{Tr}[C_{ww}(j+1)] &= \mathrm{Tr}\{[I-2\mu(R+A^TBA)]\,C_{ww}(j)\,[I-2\mu(R+A^TBA)]\} \\ &\quad + 4\mu^2\{\mathrm{Tr}[R\,C_{ww}(j)\,R] + \mathrm{Tr}[R]\,\mathrm{Tr}[C_{ww}(j)R] + \mathrm{Tr}[R]\,E\{e^2(j)\}|_{W=\bar{W}(j)} \\ &\quad + [P-R\bar{W}(j)]^T[P-R\bar{W}(j)]\} . \end{aligned} \tag{9-7}$$

Footnote: This results from applying Schwarz's Inequality to the
autocorrelation function of a stationary process.

Theorem 3: Sufficient conditions for boundedness of the trace
of the weight vector covariance matrix
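1) If Theorem 2 holds, and

2) if a) $\gamma_{max}^2 + \gamma_{max}\mathrm{Tr}[R] \le \lambda_{max}\lambda_{min}$

and

$$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max}\mathrm{Tr}[R]} ;$$

or if b) $\gamma_{max}^2 + \gamma_{max}\mathrm{Tr}[R] \ge \lambda_{max}\lambda_{min}$

and

$$0 < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max}\mathrm{Tr}[R]}$$

where

$$\gamma_{max} = \max\{\mathrm{eig}\{R\}\}$$

$$\lambda_{max} = \max\{\mathrm{eig}\{R+A^TBA\}\}$$

and

$$\lambda_{min} = \min\{\mathrm{eig}\{R+A^TBA\}\}$$

Then $\mathrm{Tr}[C_{ww}(j)]$ will be bounded for all time.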


The  proof  of  this  theorem  is  contained  in  Appendix  C. 

Theorem 3 presents conditions on $\mu$ which are sufficient to
guarantee that $\mathrm{Tr}[C_{ww}(j)]$ is a bounded sequence. Next, necessary
conditions for $\mathrm{Tr}[C_{ww}(j)]$ to be a bounded sequence are determined;
however, these are not sufficient conditions.

Theorem 4: Necessary conditions for boundedness of the trace
of the weight vector covariance matrix

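For $\mathrm{Tr}[C_{ww}(j)]$ to be a bounded sequence, it is necessary that

1) Theorem 2 hold, and also that

2) a) When $\gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R] \ge \lambda_{max}^2$,

that $$0 < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R]}$$

b) When $\lambda_{min}^2 \le \gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R] \le \lambda_{max}^2$,

that $$0 < \mu < \frac{1}{2\sqrt{\gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R]}}$$

c) When $\gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R] \le \lambda_{min}^2$,

that $$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{min}^2 + \gamma_{min}\mathrm{Tr}[R]}$$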


The  proof  of  this  theorem  is  contained  in  Appendix  D. 
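The three cases of Theorem 4 can be evaluated mechanically once the eigenvalue extremes are known. The following is a minimal sketch, not the report's own procedure: gamma_min is assumed to be the smallest eigenvalue of R, the middle case is taken here as 1/(2 sqrt(g)), which joins the other two cases continuously (treat that form as an assumption), and the function name and the numerical values are hypothetical.

```python
import math


def necessary_mu_bound(gamma_min, trace_r, lam_min, lam_max):
    """Upper limit on mu from the three cases of Theorem 4 (as read here)."""
    g = gamma_min ** 2 + gamma_min * trace_r
    if g >= lam_max ** 2:               # case a)
        return lam_max / (lam_max ** 2 + g)
    if g <= lam_min ** 2:               # case c)
        return lam_min / (lam_min ** 2 + g)
    return 1.0 / (2.0 * math.sqrt(g))   # case b), assumed form


# Hypothetical scalar environment: gamma_min = Tr[R] = 1, lam_min = lam_max = 2.
scalar_bound = necessary_mu_bound(1.0, 1.0, 2.0, 2.0)
```

In this scalar environment case c) applies, so any step size at or above 2/(4 + 2) = 1/3 fails the necessary condition.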

Figure 9-1 demonstrates some of the interrelationships among the
bounds on $\mu$ presented in Theorems 1, 3, and 4. It will be seen that
satisfying the bound on $\mu$ for mean weight vector convergence is not
always adequate to obtain good performance. The figure is an example,
obtained by plotting the various bounds on $\mu$ as a function of the power
of a signal, $\sigma^2$, in a particular environment (see footnote).

Footnote: Figures 9-1 and 9-2 were obtained by assuming that a six tap
filter is being used, receiving a signal of power $\sigma^2$ and a noise of
power 1, with $\gamma_{max}$ receiving half the input power, and the rest of the
power distributed evenly among the remaining eigenvalues, and assuming
that the eigenvalues of $A^TBA$ all have a value of 15.

[Figure 9-1: Sample relationships of bounds on $\mu$ from Theorems 1, 3,
and 4 (assuming all other Theorem conditions met). Abscissa: signal
power $\sigma^2$. Curve A: bound on $\mu$ guaranteeing MWV convergence (from
Theorem 1). Curve B: bound on $\mu$ guaranteeing WVC boundedness (from
Theorem 3). Curve C: bound on $\mu$ necessary for WVC boundedness (from
Theorem 4). Region labels: "MWV diverges"; "MWV converges, WVC is
bounded". Abbreviations: MWV = Mean Weight Vector; WVC = Weight Vector
Covariance.]

Assuming all other conditions of Theorems 1 through 4 are met,
if $\mu$ lies below curve A, then the mean weight vector will converge.
If $\mu$ lies above A, then the mean weight vector will diverge, and the
conditions of Theorems 2, 3, and 4 are not satisfied, so the behavior of
the weight vector covariance matrix is unknown. If $\mu$ lies below
both curves A and B, the weight vector covariance matrix is guaranteed
to be bounded. The shaded area at the lower right of Figure 9-1 where
curve A lies above curve C is particularly interesting. If $\mu$ lies within
this shaded region, the mean weight vector converges because $\mu$ is below
curve A, but the weight vector covariance matrix is guaranteed to
grow without bound. This performance is unacceptable even though the
mean weight vector converges.

Contrasting with the above is the area at the extreme left of
the figure (low signal power) where curve A lies below curve B, the
bound on $\mu$ which guarantees that the trace of the weight vector
covariance matrix remains bounded. In this case, and in this case only,
satisfying the bound on $\mu$ to guarantee convergence of the mean weight
vector also guarantees that the weight vector covariance will remain
finite.

There are other areas of the figure where the mean weight vector
is guaranteed to converge, but it is unknown whether the weight vector
covariance matrix will remain finite. Thus Figure 9-1 demonstrates that
the bound on $\mu$ from Theorem 1 by itself is insufficient to guarantee
desired behavior; the bounds from Theorems 3 and 4 must also be
considered.
A single upper bound for $\mu$ is desired which guarantees that the
mean weight vector converges, and also guarantees that the weight
vector covariance matrix remains bounded. A useful bound should be
calculable from prior knowledge and input signal power only, and not
depend upon knowledge about the data vector autocorrelation matrix R.

It can be shown (see footnote) that the bound in (9-4), which meets the
criteria of calculability and guarantees that the mean weight vector
converges, does not guarantee boundedness of the weight vector covariance
matrix. However, a bound satisfying these conditions is:

$$0 < \mu < \frac{1}{3\,\mathrm{Tr}[R] + \mathrm{Tr}[A^TBA]} . \tag{9-8}$$

Appendix E demonstrates that this bound satisfies the criteria listed
above.
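Because this bound depends only on traces, it can be computed without an eigenvalue decomposition. The sketch below evaluates it for the six tap environment described for Figures 9-1 and 9-2. It assumes that Tr[R] equals the number of taps times the total input power (signal plus noise) and that Tr[A^TBA] is six eigenvalues of 15; the function names are hypothetical.

```python
def calculable_mu_bound(trace_r, trace_atba):
    """Upper limit on mu: 1 / (3 Tr[R] + Tr[A^T B A])."""
    return 1.0 / (3.0 * trace_r + trace_atba)


def six_tap_example(signal_power, noise_power=1.0, taps=6, constraint_eig=15.0):
    """Bound for the six tap environment assumed for Figures 9-1 and 9-2."""
    trace_r = taps * (signal_power + noise_power)   # assumed: taps * input power
    trace_atba = taps * constraint_eig              # all eigenvalues equal 15
    return calculable_mu_bound(trace_r, trace_atba)


# The bound shrinks as the signal power grows.
for signal_power in (0.0, 1.0, 10.0):
    print(signal_power, six_tap_example(signal_power))
```

At zero signal power the limit is 1/108, and it decreases monotonically as the signal power increases.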

The bound (9-8) is plotted as curve D in Fig. 9-2, along with the
bound for convergence of the mean weight vector and the bound
guaranteeing boundedness of the weight vector covariance matrix. This
figure confirms that the bound (9-8) lies below the other bounds. It
also demonstrates that the bound (9-8) can be overly restrictive,
since it lies so far below curves A and B. The distance between the
bound (9-8) and the curves A and B depends partially on how closely
the traces of the matrices are related to the maximum eigenvalues; the
closer the trace is to the maximum eigenvalue, the closer (9-8) will
be to curves A and B.


Footnote: For example, consider the scalar case with
$\gamma_{max} = \gamma_{min} = 1$, $\mathrm{Tr}[R] = 1$, $\max\{\mathrm{eig}\{A^TBA\}\} = 1$
(which implies $\lambda_{max} = 2$).

[Figure 9-2: A calculable bound on $\mu$ guaranteeing mean weight vector
convergence and weight vector covariance boundedness (assuming all other
Theorem conditions met). Abscissa: signal power $\sigma^2$.]

X.  AN  APPLICATION  TO  ADAPTIVE  ANTENNA  ARRAYS 


This section demonstrates an application of the soft-constraint
LMS algorithm to adaptive antenna arrays. The soft constraints are
used to affect the shape of the antenna array's directivity pattern.

Fig. 10-1 shows the adaptive antenna array system. Each of the
six antenna elements is omnidirectional. The output of each antenna
element, $s_1$ through $s_6$, is fed to its own two-tap (and two-weight)
adaptive transversal filter ($TF_1$ through $TF_6$). The summed outputs
of the filters form the system output y(j). It is assumed that no
desired signal d(j) is available, so the error signal e(j) is the
array's output y(j). The objective of the algorithm with this system
is to minimize the output power while simultaneously trying to keep the
array's gain in certain directions at certain frequencies close to
values specified by the designer.

The  weight  vector  of  the  antenna  array  system  is  constructed 
by  stacking  the  six  weight  vectors  of  the  individual  adaptive  filters. 
Denote the weight vector of transversal filter k at time j by the
two dimensional vector $W_k(j)$; the weight vector W(j) of the entire
system is then a twelve dimensional vector:

$$W(j) = [W_1^T(j)\;\; W_2^T(j)\;\; \cdots\;\; W_k^T(j)\;\; \cdots\;\; W_6^T(j)]^T . \tag{10-1}$$

Construct  the  data  vector  X(j)  for  the  entire  system  similarly. 

The soft constraints will be used to specify desirable antenna
gains in a particular direction at a specified frequency.

[Figure 10-1. Adaptive Antenna Array used in Figures 10-2 through 10-12.
Diagram notes: two-tap adaptive transversal filters; speed of
propagation = 1; sampling interval = .125; label "USED FOR ADAPTATION".]

Imagine the antenna array receiving a sinusoid of power $C_r^2$ at
frequency $\omega_r$ from direction $\theta_r$. Denote the sinusoid at the input to
transversal filter $TF_k$ at time j by the phasor
$C_r\exp\{i[\omega_r jT+\phi_{rk}]\}$, where $\phi_{rk}$ is the signal's phase
difference between sensor k and some arbitrary reference point. $\phi_{rk}$ is
a function of the angle of arrival of the signal ($\theta_r$) and of the
antenna geometry. In this case the data vector is:


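$$X(j) = \begin{bmatrix} C_r\exp\{i[\omega_r jT+\phi_{r1}]\} \\ C_r\exp\{i[\omega_r (j-1)T+\phi_{r1}]\} \\ \vdots \\ C_r\exp\{i[\omega_r jT+\phi_{r6}]\} \\ C_r\exp\{i[\omega_r (j-1)T+\phi_{r6}]\} \end{bmatrix} \tag{10-2}$$

The first two elements are the data in $TF_1$; the last two elements are
the data in $TF_6$.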


Since the array output is $X^T(j)W(j)$, the array gain to this signal is
$X^T(j)W(j)/C_r\exp\{i\omega_r jT\} = [X^T(j)/C_r\exp\{i\omega_r jT\}]W(j)$. Define a
vector $A_r$ by $X(j)/C_r\exp\{i\omega_r jT\}$:

$$A_r = \begin{bmatrix} \exp\{i\phi_{r1}\} \\ \exp\{i(\phi_{r1}-\omega_r T)\} \\ \vdots \\ \exp\{i\phi_{r6}\} \\ \exp\{i(\phi_{r6}-\omega_r T)\} \end{bmatrix} \tag{10-3}$$

The array gain to signal r at time j is $A_r^TW(j)$. Suppose it is
desirable that the array gain to this signal be $D_r\exp\{i\eta_r\}$. Then the
constraint is written as

$$A_r^TW(j) = D_r\exp\{i\eta_r\} , \tag{10-4}$$

which is made a soft constraint. But W(j) is a set of real weights,
while $A_r$ and $D_r\exp\{i\eta_r\}$ are complex quantities. The proposed
constraint can still be specified with purely real values by separating
it into the real and imaginary parts:

$$\mathrm{Re}\{A_r^T\}W(j) = \mathrm{Re}\{D_r\exp\{i\eta_r\}\} , \tag{10-5}$$

$$\mathrm{Im}\{A_r^T\}W(j) = \mathrm{Im}\{D_r\exp\{i\eta_r\}\} . \tag{10-6}$$

This yields two constraints which are used as soft constraints. Thus
the antenna array attempts to keep a complex gain of $D_r\exp\{i\eta_r\}$ in
direction $\theta_r$ at frequency $\omega_r$, but since the constraints are soft, the
gain can vary from the specification ($D_r\exp\{i\eta_r\}$).

This procedure can be followed for several different sinusoids,
at the same or different frequencies, yielding a set of constraints.
Form the set of constraint vectors ($A_r$) into a matrix A; stack the
specified gains into a corresponding vector H. Then the set of soft
constraints is:

$$AW = H . \tag{10-7}$$

Weight the soft constraints by constants $b_r$, which compose the
diagonal weighting matrix B used in the algorithm (8-8) or (8-7).
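The construction of one pair of real constraint rows from (10-3), (10-5), and (10-6) can be sketched as follows. This is an illustration only: the phase values phi_rk are hypothetical (the array geometry that determines them is not reproduced here), omega is treated as an angular frequency in radians per unit time, which is an assumption, and the function names are not from the report.

```python
import cmath
import math


def constraint_vector(phases, omega, T):
    """A_r of (10-3): two entries per element for the two-tap filters."""
    a = []
    for phi in phases:
        a.append(cmath.exp(1j * phi))                # current sample tap
        a.append(cmath.exp(1j * (phi - omega * T)))  # one-sample-delayed tap
    return a


def real_constraints(phases, omega, T, amplitude, phase_deg):
    """Split A_r^T W = D_r exp{i eta_r} into the real rows of (10-5), (10-6)."""
    a = constraint_vector(phases, omega, T)
    target = amplitude * cmath.exp(1j * math.radians(phase_deg))
    rows = [[z.real for z in a], [z.imag for z in a]]
    rhs = [target.real, target.imag]
    return rows, rhs


# Hypothetical per-element phases (radians) for one arrival direction.
phases = [0.0, 0.3, 0.6, 0.9, 1.2, 1.5]
rows, rhs = real_constraints(phases, omega=2.0, T=0.125,
                             amplitude=1.0, phase_deg=180.0)
```

Stacking the rows returned for each constrained direction gives the matrix A, and stacking the right-hand sides gives H, as in (10-7). Each complex constraint contributes two rows.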

Three different sets of constraints, derived as shown, are listed
in Tables 1, 2, and 3. The features and effects of each set of
constraints are now studied.

The Effect of Table 1 Constraints

Figures 10-2 through 10-6 show the first example of the use of soft
constraints. The soft constraints specify unity power gain at frequency
2, in the directions 0, 10, 120, 180, -120, -10 (in degrees). The
constraints are weighted equally at 1. Table 1 summarizes these
constraints.


Direction of    Amplitude of       Phase of        Constraint
Constraint      Desired Gain at    Desired Gain    Weighting
(degrees)       Frequency 2        (degrees)       Factor
$\theta_r$      $D_r$              $\eta_r$        $b_r$
---------------------------------------------------------------
    0           1.                  180.0          1.
   10           1.                  177.3          1.
  120           1.                  -90.0          1.
  180           1.                 -180.0          1.
 -120           1.                  -90.0          1.
  -10           1.                  177.3          1.

Table 1 - Constraint Set 1


Figure  10-2  shows  the  antenna  directivity  pattern  that  results  when 
the  constraint  equations  (10-7)  are  solved  for  the  weight  vector  which 
satisfies  them  exactly.  This  figure  is  a  plot  of  the  power  gain  that  a 
signal  at  a  frequency  of  2  receives,  as  a  function  of  the  arrival 
direction  of  the  signal,  when  the  weight  vector  is  the  solution  to  the 
constraint  equations. 

[Figure 10-2. Antenna array directivity pattern determined by soft
constraints listed in Table 1. (Weight vector frozen at the solution to
the constraint equations.) Marked points are constraints; figures next to
constraint points are constraint weighting factors.]

Figure 10-3 shows the antenna directivity pattern resulting when a
unity power sinusoid at frequency 2 is received from 0 degrees, when the
antenna array system has adapted to the point of convergence. This
example (and all others in this section) also has an isotropic white
noise field impinging on the antenna array. The noise power at each
antenna element is 0.1. Recall that the goal is to minimize the output
power while trying to keep the constraint errors small (i.e., keep the
gain in the constraint direction close to the constraint values). It can
be seen that in the signal's direction the array gain has decreased
slightly from that of Fig. 10-2. But as the gain in the signal direction
has decreased, the constraint error in that direction has grown (as have
the constraint errors in the 10 and -10 degree directions). Thus
the soft constraints result in the gain in the signal direction remaining
high, keeping the constraint errors low. The constraint errors at 120,
-120 and 180 degrees are kept small without increasing the system output
power significantly.

[Figure 10-3. Antenna array directivity pattern after adaptation by
the soft-constraint LMS algorithm (8-8) with: 1) a sinusoid of power 1,
frequency 2, from 0°; 2) soft constraints of Table 1; 3) isotropic noise
of power 0.1. Marked points are constraints; figures next to constraint
points are constraint weighting factors.]

Figure 10-4 shows the converged antenna array directivity pattern
for the same constraints, when a unity power sinusoid at a frequency of
2 is received from 60 degrees, an angle not near any of the constraints.
When the adaptive filters reach convergence, the signal is attenuated by
30 dB, while the constraint error remains small.

Figure 10-5 shows the antenna array directivity pattern for the
same set of constraints (Table 1) when the unity power sinusoid at a
frequency of 2 is received from 120 degrees, coincident with a constraint.
The attenuation in the signal direction is small compared to that of
Figure 10-4, but is greater than that of Figure 10-3, in which the signal
was arriving close to three constraints, instead of only a single
constraint.

[Figure 10-4. Antenna directivity pattern after adaptation by the
soft-constraint LMS algorithm (8-8) with: 1) a sinusoid of power 1,
frequency 2, from 60°; 2) soft constraints of Table 1; 3) isotropic noise
of power 0.1. Marked points are constraints; figures next to constraint
points are constraint weighting factors.]

[Figure 10-5. Antenna array directivity pattern after adaptation by
the soft-constraint LMS algorithm (8-8) with: 1) a sinusoid of power 1,
frequency 2, from 120°; 2) soft constraints as listed in Table 1;
3) isotropic noise of power 0.1. Marked points are constraints; figures
next to constraint points are constraint weighting factors.]

Figure 10-6 is a plot of the converged array gain in the signal
direction, for all possible signal arrival directions. This plot is
obtained by placing the unity power signal at a specified direction,
calculating the optimum weight vector for this signal configuration,
using this optimum weight vector to calculate the gain in the signal
direction, and plotting this gain as a single point in Figure 10-6. For
example, in Figure 10-5 the gain in the signal direction (120 degrees)
is approximately -6 dB. This same gain is plotted on Figure 10-6 at the
120 degree position. Figure 10-6 demonstrates that for the constraints
specified in Table 1 the gain remains high in directions close to
constraints, but signals are more strongly attenuated when not close to
constraints.

[Figure 10-6. Converged array gain in the signal direction for all
possible directions of signal arrival. Conditions: 1) soft constraints of
Table 1; 2) adaptation by soft-constraint LMS algorithm (8-8);
3) sinusoid power = 1, frequency = 2; 4) isotropic noise of power 0.1.
Marked points are constraints; figures next to constraint points are
constraint weighting factors.]

Effect  of  Table  2  Constraints 

Figure 10-7 shows the converged array gain in the signal direction
for all possible directions of signal arrival, for the set of constraints
in Table 2. These constraints differ from the previous constraints in
that the weightings in the 120, -120, and 180 degree positions are
decreased by a factor of 100. Figure 10-7 shows the effect: the array's
gain to signals arriving from directions close to the weak constraints
is greatly reduced from the previous case (Fig. 10-6). This occurs
because the output power is significantly reduced by decreasing the array
gain in the signal arrival direction, while incurring only small
constraint errors due to the low weighting coefficient. Thus the
weighting coefficients control the "softness" of the constraint. A large
weighting coefficient implies that the decrease in output power must be
large to allow a small deviation from the constraint; a small weighting
coefficient implies that greater deviation from the constraint is allowed
with little penalty, so the algorithm can decrease the output power
significantly.

Direction of    Amplitude of       Phase of        Constraint
Constraint      Desired Gain at    Desired Gain    Weighting
(degrees)       Frequency 2        (degrees)       Factor
$\theta_r$      $D_r$              $\eta_r$        $b_r$
---------------------------------------------------------------
    0           1.                  180.0          1.
   10           1.                  177.3          1.
  120           1.                  -90.0          .01
  180           1.                 -180.0          .01
 -120           1.                  -90.0          .01
  -10           1.                  177.3          1.

Table 2 - Constraint Set 2

[Figure 10-7. Converged array gain in the signal direction for all
possible directions of signal arrival. Conditions: 1) soft constraints of
Table 2; 2) adaptation by soft-constraint LMS algorithm (8-8);
3) sinusoid power = 1, frequency = 2; 4) isotropic noise of power 0.1.
Marked points are constraints; figures next to constraint points are
constraint weighting factors.]
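The way a weighting coefficient sets the "softness" of a constraint can be seen in a one-weight analogy. The sketch below is not the array computation used for the figures: it assumes a single real weight w, an input of total power p (signal plus noise), no desired signal, and one constraint w = h with weighting b, so that the optimum weight reduces to the scalar form b h / (p + b). The function name and the numbers are hypothetical.

```python
import math


def soft_constrained_weight(input_power, weighting, desired_gain):
    """Scalar optimum: minimizes input_power*w**2 + weighting*(w - desired_gain)**2."""
    return weighting * desired_gain / (input_power + weighting)


# Unit power signal plus noise of power 0.1, unity desired gain.
input_power = 1.0 + 0.1
for weighting in (1.0, 0.01):
    w = soft_constrained_weight(input_power, weighting, 1.0)
    gain_db = 20.0 * math.log10(w)
    print(weighting, w, gain_db)
```

With the strong weighting the gain stays within a few dB of the specification, while reducing the weighting by a factor of 100 lets the gain fall by tens of dB in exchange for a lower output power.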

Effect of Table 3 Constraints

Figure 10-8 demonstrates the effect on the large lobe of Figure 10-7
when the two constraints at 10 and -10 degrees are moved to 60 and -60
degrees and simultaneously weakened by a factor of 100. Table 3 presents
this set of constraints. Comparing Figures 10-7 and 10-8 shows that
when two strong constraints are at the 10 and -10 degree positions as in
Figure 10-7, the angular sector over which a signal is received without
significant attenuation is much broader than when only a single strong
constraint is present, as in Figure 10-8.

Table 3 - Constraint Set 3

[Figure 10-8. Converged array gain in the signal direction for all
possible directions of signal arrival. Conditions: 1) soft constraints of
Table 3; 2) adaptation by soft-constraint LMS algorithm (8-8);
3) sinusoid power = 1, frequency = 2; 4) isotropic noise of power 0.1.
Marked points are constraints; figures next to constraint points are
constraint weighting factors.]

Antenna  Array  Gain  in  the  Constraint  Directions 

Figure  10-9  shows  the  gain  in  the  0  degree  direction  maintained  by 
the  soft  constraint,  for  all  possible  arrival  directions  of  a  unit  power 
signal  with  a  frequency  of  2,  for  the  constraints  of  Table  2.  The  plot 
is  calculated  by  placing  the  signal  at  a  given  direction,  calculating 
the  converged  weight  vector,  calculating  the  resulting  gain  at  0  degrees 
(frequency  of  2),  and  plotting  it  on  Figure  10-9.  The  gain  in  the  0 
degree  position  remains  close  to  the  unity  gain  specification,  decreasing 
only  when  the  signal  is  also  close  to  0  degrees.  When  the  signal  arrives 
from  close  to  0  degrees,  the  array  gain  in  the  0  degree  direction  drops 
slightly  to  reduce  the  system  output  power,  but  cannot  drop  significantly 
without  causing  large  constraint  errors. 

Figure  10-10  shows  the  array  gain  in  the  direction  of  the  much 
weaker  constraint  at  180  degrees  for  all  possible  arrival  directions  of  a 
unit  power  signal  with  a  frequency  of  2  (again  using  the  constraints  of 
Table  2).  Since  the  gain  in  this  direction  can  vary  greatly  without 
incurring  large  constraint  error  (no  strong  constraints  in  this  region), 
the  adaptive  array  concentrates  on  minimizing  output  power  rather  than  on 
maintaining  the  constraint,  as  seen  by  the  wide  variation  in  the  array's 
gain  in  this  direction. 

[Figure 10-9. Array gain in the zero degree direction maintained by
the soft constraint, when array has been adapted to convergence on a
signal arriving from the direction specified along the abscissa.
Conditions: 1) soft constraints of Table 2, as listed; 2) adaptation by
soft-constraint algorithm (8-8); 3) sinusoid power = 1, frequency = 2;
4) isotropic noise of power 0.1. Abscissa: signal arrival direction.]

[Figure 10-10. Array gain in the 180 degree direction maintained by
the soft constraint, when array has been adapted to convergence on a
signal arriving from the direction specified along the abscissa.
Conditions: 1) soft constraints of Table 2, as listed; 2) adaptation by
soft-constraint algorithm (8-8); 3) sinusoid power = 1, frequency = 2;
4) isotropic noise of power 0.1.]

A Two-Signal Case

Figures 10-11 and 10-12 show a two signal case. Figure 10-11 shows
the antenna array directivity pattern for a particular signal
configuration, with the Table 2 constraints. The signal (#1) arriving
from 0 degrees has a power of 1; the signal (#2) arriving from 10 degrees
has a power of 100. Both signals have a frequency of 2. The figure shows
that the strong signal (#2) is greatly attenuated even though it is
arriving from a direction where a constraint is located. This is because
the signal is very strong compared to the constraint in this direction;
the array concentrates on attenuating the signal rather than on
satisfying the constraint. The weak signal (#1) arriving from 0 degrees
is only slightly attenuated, because it is weak in comparison with any of
the constraints in the neighborhood. Note that the presence of the strong
signal has had little or no effect on the array's gain to the weak signal
(compare the gain with Figure 10-3).

[Figure 10-11. Antenna array directivity pattern after adaptation by
the soft-constraint LMS algorithm (8-8) with two signals. Conditions:
1) signal #1 power = 1, frequency = 2, from 0°; 2) signal #2 power = 100,
frequency = 2, from 10°; 3) soft constraints as listed in Table 2;
4) isotropic noise of power 0.1. Marked points are constraints; figures
next to constraint points are constraint weighting factors.]

Figure  10-12  shows  the  array’s  gain  to  the  weak  signal  (#1)  at  0 
degrees  as  a  function  of  the  arrival  direction  of  the  strong  signal  (#2). 
This  figure  shows  that  the  array's  gain  to  the  weak  signal  is  only 
affected  by  the  strong  signal  when  the  strong  signal  is  arriving  from  a 
direction  very  close  to  that  of  the  weak  signal.  When  the  strong  signal 
arrives  from  0  degrees  the  two  signals  are  inseparable;  the  array  acts  as 
if  there  were  only  one  strong  signal.  Since  this  composite  signal  is 
strong  compared  to  the  constraint  at  0  degrees,  the  array  concentrates  on 
attenuating  the  composite  signal.  As  the  strong  signal  moves  away  from 
the  weak  signal,  the  array  is  better  able  to  resolve  the  two  signals,  and 
continues  to  attenuate  the  strong  signal,  while  allowing  the  gain  in  the 
direction of the constraint at 0 degrees to increase again.

[Figure 10-12. Ordinate: array gain (in dB) to signal #1.]


Summary 

This  section  has  demonstrated  the  use  of  soft  constraints  for 
adaptive  antenna  arrays.  It  has  shown  that  soft  constraints  can 
maintain  array  gain  in  the  presence  of  signals  which  are  weak  compared 
to  the  constraints;  but  strong  signals  are  attenuated.  It  was  also 
seen  that  the  strength  of  constraints  could  be  varied,  and  placing 
constraints  closely  together  could  expand  the  angular  sector  over 
which the array gain is maintained.

XI. OUTPUT POWER DUE TO A SIGNAL AS A FUNCTION OF ITS INPUT POWER

This section investigates the power at the output of a converged
soft-constraint LMS adaptive filter due to a particular input signal,
and relates that output power to the signal's input power.

When fixed (non-adapting) filters are used for signal processing, an
increase in the input power of a signal always means a corresponding
increase in the output power of the filter. But this is not necessarily
true for adaptive filters. An increase in the power of any signal can
change the optimum filter, and the new optimum filter might attenuate
the signal more strongly than the previous optimum filter. The increase
in attenuation can be so great that it more than cancels the signal's
increase of input power, so that the output power due to the signal is
actually less than before. This phenomenon is studied in this section.

Assume  that  the  time  sequence  u(j),  the  input  to  the  adaptive  filter, 
consists  of  a  stationary  signal  to  be  studied,  denoted  s(j);  and  that  all 
other  signals  in  u(j)  are  also  stationary,  and  when  summed  together  are 
denoted  by  n(j): 

u(j)  =  s(j)  +  n(j)  .  (11-1) 

Denote the autocorrelation of s(j) when it is in the tapped delay line
by $\sigma^2R_{ss}$, where $\sigma^2$ is the power of s(j). Denote the
cross-correlation between s(j) and the desired signal d(j) by
$f(\sigma)P_{ds}$, where $f(\sigma)$ is a function of the input power of s(j).
(Several expressions for $f(\sigma)$ are studied later in this section.)
Assume that s(j) and n(j) are uncorrelated:

$$E\{s(j)n(j)\} = 0 . \tag{11-2}$$

Denote the autocorrelation of n(j) when in the tapped delay line by
$R_{nn}$, and the cross-correlation between n(j) and d(j) by $P_{dn}$.

With these definitions the complete input autocorrelation matrix
is $\sigma^2R_{ss}+R_{nn}$, and the complete input cross-correlation with the
desired signal is $f(\sigma)P_{ds}+P_{dn}$.

Using (6-3) the optimum weight vector is:

$$W_{opt} = (\sigma^2R_{ss}+R_{nn}+A^TBA)^{-1}[f(\sigma)P_{ds}+P_{dn}+A^TBH] . \tag{11-3}$$

For ease of notation, denote $R_{nn}+A^TBA$ by U, and $P_{dn}+A^TBH$ by V; also
assume that U is a matrix of full rank. This yields:

$$W_{opt} = (\sigma^2R_{ss}+U)^{-1}[f(\sigma)P_{ds}+V] . \tag{11-4}$$

The output power of the adaptive filter due to the input signal
under study s(j) is:

$$p_{out} = W^T\sigma^2R_{ss}W . \tag{11-5}$$

Using the optimum weight vector of (11-4) yields:

$$p_{out} = [f(\sigma)P_{ds}+V]^T(\sigma^2R_{ss}+U)^{-1}\sigma^2R_{ss}(\sigma^2R_{ss}+U)^{-1}[f(\sigma)P_{ds}+V] . \tag{11-6}$$

Appendix F demonstrates that since $R_{ss}$ and U are both hermitian
matrices and U is nonsingular, a matrix S can be found such that:

$$S^TR_{ss}S = \Psi , \tag{11-7}$$

$$S^TUS = I , \tag{11-8}$$

where $\Psi$ is a diagonal matrix, and I is the identity matrix. (S will
be a purely real matrix, since both $R_{ss}$ and U are purely real. Thus
$S^T = S^+$.) Rearranging (11-7) and (11-8) yields:

$$R_{ss} = S^{-T}\Psi S^{-1} , \tag{11-9}$$

$$U = S^{-T}S^{-1} , \tag{11-10}$$

where $(S^T)^{-1}$ is abbreviated to $S^{-T}$. Substituting (11-9) and (11-10)
into (11-6) yields:


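$$\begin{aligned} p_{out} &= [f(\sigma)P_{ds}+V]^T(\sigma^2S^{-T}\Psi S^{-1}+S^{-T}S^{-1})^{-1}\,\sigma^2S^{-T}\Psi S^{-1}\,(\sigma^2S^{-T}\Psi S^{-1}+S^{-T}S^{-1})^{-1}[f(\sigma)P_{ds}+V] \\ &= [f(\sigma)P_{ds}+V]^TS(\sigma^2\Psi+I)^{-1}S^T\,\sigma^2S^{-T}\Psi S^{-1}\,S(\sigma^2\Psi+I)^{-1}S^T[f(\sigma)P_{ds}+V] \\ &= [f(\sigma)P_{ds}+V]^TS(\sigma^2\Psi+I)^{-1}\sigma^2\Psi(\sigma^2\Psi+I)^{-1}S^T[f(\sigma)P_{ds}+V] . \end{aligned} \tag{11-11}$$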
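The step from the first to the second line of this derivation rests on one matrix identity, written out here as a short worked check. It uses only the relations $R_{ss} = S^{-T}\Psi S^{-1}$ and $U = S^{-T}S^{-1}$ and the fact that S is invertible; the symbols $\sigma$, $\Psi$, and S follow the notation assumed in this section.

```latex
\begin{align*}
\sigma^2 S^{-T}\Psi S^{-1} + S^{-T}S^{-1}
  &= S^{-T}\,(\sigma^2\Psi + I)\,S^{-1}, \\
\bigl(\sigma^2 S^{-T}\Psi S^{-1} + S^{-T}S^{-1}\bigr)^{-1}
  &= S\,(\sigma^2\Psi + I)^{-1}\,S^{T}, \\
S^{T}\,\bigl(\sigma^2 S^{-T}\Psi S^{-1}\bigr)\,S
  &= \sigma^2\Psi .
\end{align*}
```

Since $\Psi$ and $(\sigma^2\Psi + I)^{-1}$ are both diagonal, they commute, which is why the final expression separates into independent per-component terms.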


Now since $\Psi$ is a diagonal matrix ($\Psi = \mathrm{diag}(\psi_i)$), $p_{out}$ can be
written in terms of individual components as:

$$p_{out} = \sum_{i=1}^{n} p_{out,i} = \sum_{i=1}^{n} \frac{\sigma^2\psi_i}{(\sigma^2\psi_i+1)^2}\left(\{S^T[f(\sigma)P_{ds}+V]\}_i\right)^2 , \tag{11-12}$$

where $\{S^T[f(\sigma)P_{ds}+V]\}_i$ denotes the ith element of the vector
$S^T[f(\sigma)P_{ds}+V]$.

Thus the output power is the sum of a set of components $p_{out,i}$
which vary individually as the input power $\sigma^2$ of the signal under
study is varied. The value of an individual component $p_{out,i}$ of the
output power $p_{out}$ may be plotted as a function of the input power $\sigma^2$
once $f(\sigma)$ is known. Three cases are of particular interest:

Case 1: $f(\alpha) = \alpha$.

The  signal  under  study  s(j)  is  correlated  with  the  desired  signal  d(j), 
but  the  power  of  d(j)  remains  constant  even  when  the  input  power  of 
s(j)  increases.  This  case  can  occur  when  d(j)  is  generated  separately 
from  s(j).  For  this  case,  a  component  of  (11-12)  has  the  form 


$$p_{out,i} = \frac{\alpha^2 \psi_i}{(\alpha^2 \psi_i + 1)^2} \, \{S^T [\alpha P_{ds} + V]\}_i^2 . \tag{11-13}$$


The shape of this function is shown in Figure 11-1a. The figure
shows that the curve can have one of two forms, depending on whether
or not $[S^T P_{ds}]_i$ and $[S^T V]_i$ have the same sign. When the signs are the
same, the gain of the filter to $s(j)$ increases slightly at first, then
decreases toward zero (as seen in Fig. 11-1b). However, the rate of
decrease of gain compared to the rate of increase of input power is
such that the output power approaches an asymptotic value of
$[S^T P_{ds}]_i^2 / \psi_i$ (as seen in Fig. 11-1a), which is the limit of
(11-13) as $\alpha \to \infty$. The decrease occurs because $s(j)$
begins to dominate $n(j)$ and overwhelm the soft constraints, so the
adaptive filter begins to do power equalization to make the power of
filter output $y(j)$ match that of $d(j)$. For opposite signs, Fig. 11-1a
shows that the output power due to $s(j)$ can increase, then decrease to
zero again, and finally increase to the asymptotic value. The reason
for the decrease is that initially the weight vector is dominated by
$n(j)$ or the soft constraints, and the estimate of $d(j)$ has the wrong
sign compared to $d(j)$. Fig. 11-1b shows that the filter's gain begins
with the wrong sign. As $s(j)$ grows, it has more effect on the weight
vector. For a good estimate, the sign of the estimate and the filter
gain must change, causing the output power to go through zero at some
point.

Case 2: $f(\alpha) = \alpha^2$.

The  signal  under  study  s(j)  is  correlated  with  the  desired  signal  d(j), 
and  d(j)  is  derived  from  s(j),  so  that  the  power  of  d(j)  increases 
linearly  with  an  increase  in  the  power  of  s(j).  An  example  of  this 
relationship  is  the  line  enhancer  configuration  [3,4, li].  In  this  case 
a  component  of  (11-12)  has  the  form 


$$p_{out,i} = \frac{\alpha^2 \psi_i}{(\alpha^2 \psi_i + 1)^2} \, \{S^T [\alpha^2 P_{ds} + V]\}_i^2 . \tag{11-14}$$


Figure 11-2a shows the shape of this curve. The asymptote of the output
power curves (Fig. 11-2a) is a parabola. When $[S^T P_{ds}]_i$ and $[S^T V]_i$ have
the same sign, the output power curve essentially follows the parabola;
the soft constraints (and/or $n(j)$) cause the weight to be of the proper
sign but larger than necessary, so the output power curve is above
the asymptote. For opposite signs, the weight must again change signs,
causing the dip to zero output power as seen, then increase once the
proper sign is obtained. Fig. 11-2b shows that in either case the filter
gain approaches a constant.

Figure 11-2. Soft-constraint LMS adaptive filter gain to a signal
component, and the corresponding signal component
output power, as a function of the signal component
input power. Case 2: $f(\alpha) = \alpha^2$.

Case 3: $f(\alpha) = 0$.

The signal under study $s(j)$ is uncorrelated with the desired signal
$d(j)$. This occurs when $s(j)$ is noise or interference. In this case,
a component of (11-12) has the form

$$p_{out,i} = \frac{\alpha^2 \psi_i}{(\alpha^2 \psi_i + 1)^2} \, \{S^T V\}_i^2 . \tag{11-15}$$


Figure 11-3a shows the shape of this curve. Here, the filter begins
to turn itself off as $s(j)$ begins to dominate $n(j)$ and the soft
constraints. This curve is of strong interest because it shows that
strong signals can be attenuated much more than weak signals. This
phenomenon could be used to create adaptive filters which pass weak
signals but attenuate strong signals, effectively filtering signals
based on their strength. The quantity $\psi_i$ determining where the peak
of this curve occurs is under some control by the filter designer,
since selection of the soft constraints affects $\psi_i$.
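
The three cases can be compared numerically. The following is a minimal sketch, assuming NumPy, that evaluates one term of (11-12) for a chosen $\psi_i$; the function name and the arguments `sp_i` and `sv_i` (standing for the $i$th elements of $S^T P_{ds}$ and $S^T V$) are hypothetical names introduced only for illustration, and the numerical values are arbitrary.

```python
import numpy as np

def output_power_component(alpha2, psi_i, sp_i, sv_i, case):
    """Evaluate one term p_out,i of (11-12).

    alpha2 : input power alpha^2 of the signal under study (scalar or array)
    psi_i  : i-th diagonal element of Psi
    sp_i   : i-th element of S^T P_ds (hypothetical name)
    sv_i   : i-th element of S^T V    (hypothetical name)
    case   : 1 -> f = alpha, 2 -> f = alpha^2, 3 -> f = 0
    """
    alpha2 = np.asarray(alpha2, dtype=float)
    alpha = np.sqrt(alpha2)
    if case == 1:
        f = alpha
    elif case == 2:
        f = alpha2
    elif case == 3:
        f = np.zeros_like(alpha2)
    else:
        raise ValueError("case must be 1, 2, or 3")
    # Common factor alpha^2 psi_i / (alpha^2 psi_i + 1)^2 of (11-13)-(11-15)
    shaping = alpha2 * psi_i / (alpha2 * psi_i + 1.0) ** 2
    return shaping * (f * sp_i + sv_i) ** 2

# Arbitrary illustrative values
alpha2_grid = np.logspace(-3, 3, 601)
case3 = output_power_component(alpha2_grid, psi_i=2.0, sp_i=1.0, sv_i=0.5, case=3)
peak_power = alpha2_grid[np.argmax(case3)]  # Case 3 peaks near alpha^2 = 1 / psi_i

# Case 1 with opposite signs: the output power passes through zero
null = output_power_component(0.25, psi_i=2.0, sp_i=1.0, sv_i=-0.5, case=1)
```

In this sketch the Case 3 peak falls where $\alpha^2 \psi_i = 1$, which is why the choice of soft constraints (through $\psi_i$) moves the peak.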
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XII.  RELATION  OF  THE  SOFT-CONSTRAINT  LMS  ALGORITHM 


TO  OTHER  VERSIONS  OF  THE  LMS  ALGORITHM 


The  LMS  Algorithm 

The  LMS  algorithm  defined  in  references  [1-3]  is: 

$$W(j+1) = W(j) + 2\mu e(j) X(j) . \tag{12-1}$$

By comparing (12-1) with the soft-constraint LMS algorithm (8-7) it is
seen that the LMS algorithm is a special case of the soft-constraint LMS
algorithm, since setting $B = 0$ in (8-7) yields (12-1). The effect of
setting $B = 0$ is that all of the soft constraints are turned off.

The  Leaky  LMS  Algorithms 

By examining (8-8) it can be seen that the "leaky" LMS algorithm
[11-13]:

$$W(j+1) = \nu W(j) + 2\mu e(j) X(j) \tag{12-2}$$

is also a special case of the soft-constraint LMS algorithm. The leaky
LMS algorithm has a multiplier $\nu$ on the $W(j)$ term which is a positive
scalar less than one. The soft-constraint LMS algorithm has a corresponding
term $I - 2\mu A^T B A$ which is a matrix. However, if $A$ is chosen to be an
identity matrix, and $B$ is diagonal with the diagonal elements all equal
to a scalar $\gamma$, then the multiplier in the soft-constraint LMS algorithm
(8-8) reduces to the scalar $1 - 2\mu\gamma$.

The other difference between the leaky LMS algorithm (12-2) and the
soft-constraint LMS algorithm (8-8) is the presence of the driving term
$2\mu A^T B H$ in the latter. However, by choosing $H$ to be zero, this term
disappears. Thus, the leaky LMS algorithm is seen to be a special case
of the soft-constraint LMS algorithm, by choosing the constraints in the
latter to constrain each of the weights to zero, with identical weighting
on each of the constraints.

Zahm's Algorithm

Zahm's algorithm [14] is:

$$W(j+1) = \nu W(j) - 2\mu y(j) X(j) + V \tag{12-3}$$

where $V$ is a constant vector.

Zahm's algorithm is also a special case of the soft-constraint LMS
algorithm (8-8), when the matrices $A$ and $B$ in the latter are chosen in
the same manner as for the leaky LMS algorithm, and a non-zero constraint
vector $H$ is selected such that $V = 2\mu A^T B H$, and with no desired signal
available.
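
The special cases above can be checked with a short numerical sketch. It assumes NumPy, takes the update in the form quoted in (A-2), and uses the error $e(j) = d(j) - X^T(j) W(j)$ implied by (A-3). The function name and all numerical values are illustrative assumptions, not part of the original report.

```python
import numpy as np

def soft_constraint_lms_step(W, X, d, mu, A, B, H):
    """One iteration of the soft-constraint LMS update as written in (A-2):
    W(j+1) = (I - 2 mu A^T B A) W(j) + 2 mu e(j) X(j) + 2 mu A^T B H."""
    e = d - X @ W                                   # e(j) = d(j) - X^T(j) W(j)
    leak = np.eye(len(W)) - 2.0 * mu * A.T @ B @ A  # matrix multiplier on W(j)
    return leak @ W + 2.0 * mu * e * X + 2.0 * mu * A.T @ B @ H

# Arbitrary illustrative data
W = np.array([0.3, -0.2, 0.5])
X = np.array([1.0, 0.4, -0.7])
d, mu, gamma = 0.7, 0.01, 2.0
I3 = np.eye(3)

# LMS (12-1): B = 0 turns all soft constraints off
lms = soft_constraint_lms_step(W, X, d, mu, I3, np.zeros((3, 3)), np.zeros(3))

# Leaky LMS (12-2): A = I, B = gamma I, H = 0 gives nu = 1 - 2 mu gamma
leaky = soft_constraint_lms_step(W, X, d, mu, I3, gamma * I3, np.zeros(3))

# Zahm (12-3): same A and B, nonzero H, no desired signal (d = 0)
H = np.ones(3)
zahm = soft_constraint_lms_step(W, X, 0.0, mu, I3, gamma * I3, H)
```

With `d = 0` the error is $-y(j)$, which is how the sign of the middle term in (12-3) arises.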

Frost's Hard Constraint LMS Algorithm

Frost's algorithm [9] is extremely similar to the soft-constraint
LMS algorithm (8-8) because the constraints are identical. The only
difference is that Frost requires exact solution of the constraints at
all times. Intuitively, one would feel that as the soft constraints are
stiffened, the soft-constraint LMS algorithm's solution would approach
that of the hard-constraint LMS algorithm. This is true, and can be
stated as follows: denote the optimum weight vector for Frost's hard
constraint problem by $W_{hc}$. Now consider letting the weighting matrix $B$
on a set of soft constraints for (8-8) be multiplied by a scalar $\gamma$, so
that the true weighting is $\gamma B$. Then for the optimum weight vector $W_{opt}$
in (8-8):

$$\lim_{\gamma \to \infty} W_{opt} = W_{hc} . \tag{12-4}$$


This  relation  is  proved  as  Theorem  5  in  Appendix  G. 
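
The limit (12-4) can be illustrated numerically. The sketch below assumes NumPy and uses the soft-constraint optimum $(R + \gamma A^T B A)^{-1}(P + \gamma A^T B H)$ from (6-3) with weighting $\gamma B$. For the hard-constraint optimum it assumes the standard Lagrange-multiplier solution of minimizing the mean square error subject to $AW = H$; that closed form is an assumption of this example and is not quoted from the report. All matrices are arbitrary illustrative values.

```python
import numpy as np

def soft_optimum(R, P, A, B, H, gamma):
    """Optimum weights of the soft-constraint problem with weighting gamma * B."""
    M = R + gamma * A.T @ B @ A
    return np.linalg.solve(M, P + gamma * A.T @ B @ H)

def hard_optimum(R, P, A, H):
    """Assumed hard-constraint optimum: minimize W^T R W - 2 P^T W subject to A W = H
    (standard Lagrange-multiplier solution)."""
    Rinv_P = np.linalg.solve(R, P)
    Rinv_At = np.linalg.solve(R, A.T)
    return Rinv_P + Rinv_At @ np.linalg.solve(A @ Rinv_At, H - A @ Rinv_P)

# Arbitrary illustrative problem: three weights, one linear constraint
R = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 0.8]])
P = np.array([1.0, 0.5, -0.3])
A = np.array([[1.0, 1.0, 1.0]])
B = np.array([[1.0]])
H = np.array([1.0])

W_hc = hard_optimum(R, P, A, H)
gaps = [np.linalg.norm(soft_optimum(R, P, A, B, H, g) - W_hc)
        for g in (1.0, 1e2, 1e4, 1e6)]  # distance to W_hc shrinks as gamma grows
```

In this example the distance to the hard-constraint solution falls roughly in proportion to $1/\gamma$.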

Thus the optimum weight vector of the soft-constraint LMS algorithm
approaches the optimum weight vector of Frost's hard constraint LMS
algorithm in the limit as the hardness (weighting) of the soft constraints
goes to infinity.

XIII. CONCLUSIONS AND DISCUSSION

The  designers  of  adaptation  algorithms  usually  derive  the 
algorithms  to  minimize  estimation  error.  But  often  the  designer  has 
additional  criteria  for  the  algorithm  to  satisfy,  which  requires 
modification  of  the  algorithm.  The  underlying  concept  of  this  paper 
is  that  the  adaptation  algorithm  should  be  derived  from  a  function 
which  explicitly  includes  terms  involving  all  of  the  design  criteria. 

This  paper  has  demonstrated  the  principle  by  combining  a  set  of 
soft  linear  constraints  with  a  mean  square  error  criterion.  Once  the 
performance  function  was  so  defined,  the  soft-constraint  LMS  algorithm 
was  directly  obtained. 

It  has  been  proved  that  in  a  stationary  environment  and  when 
certain  conditions  are  satisfied,  the  soft-constraint  LMS  algorithm 
causes  the  filter  to  converge,  minimizing  the  performance  function. 

It  was  also  shown  that  setting  the  adaptation  constant  to  obey  the 
conditions  for  convergence  in  the  mean  was  not  always  sufficient  to 
obtain  good  behavior;  in  some  cases  more  restrictive  conditions  must 
be  observed.  Since  these  conditions  depend  upon  considerable  a  priori 
knowledge  of  the  environment,  which  is  generally  not  available,  an 
even  more  restrictive  condition  was  proposed  which  has  the  advantage 
of  depending  on  the  environment  only  in  that  the  total  input  power 
(a  measurable  quantity)  must  be  known. 

The usefulness of the soft-constraint LMS algorithm has been
shown by applying it to an adaptive antenna array (section X). The
constraints were changed in strength, with resulting changes in the
directivity pattern of the array and in its response to incoming
sinusoidal signals. This example demonstrated the effect of varying
the "stiffness" of a constraint. A constraint in an important location
(direction of possible signal arrival) can be stiffened so that
deviations from it remain small, while constraints in less important
locations can be made softer so that larger variations are permitted.
This may in some cases prove advantageous as a "trade-off" in return
for maintaining a close approach to minimization of estimation error.
This flexibility of the soft-constraint LMS algorithm could be an
advantage in building adaptive antenna arrays which attenuate strong
jammers while maintaining reception in directions where desirable
signals are expected to appear.

This paper has also derived a relation between the output power
of a signal from a converged soft-constraint LMS adaptive filter and
the signal input power. This relation demonstrated the unexpected
behavior that in some cases, although the input power is increasing
monotonically, the output power could increase, then decrease to zero,
and then increase again. Another interesting case was shown where the
output power increases to a peak, and then decreases monotonically,
while the input power is increasing monotonically. It is possible that
useful applications of these output power phenomena exist; this is an
area for future research. In addition, no physical interpretation for
the matrix $\Psi$ used in the development has been presented; some
properties of $\Psi$ are given at the end of appendix F. This remains
an area for further study.


It  was  shown  that  the  LMS,  leaky  LMS,  and  Zahm's  algorithms  are 
all  special  cases  of  the  soft-constraint  LMS  algorithm.  Thus,  results 
for  the  soft-constraint  LMS  algorithm  also  hold  for  these  previous 
algorithms.  It  was  also  shown  that  the  optimum  solution  to  a  soft 
constraint  problem  approaches  the  optimum  solution  of  a  hard  constraint 
problem  as  the  stiffness  of  the  soft  constraints  goes  to  infinity. 

Thus  the  soft-constraint  LMS  algorithm  is  a  generalization  of 
several  existing  algorithms.  It  has  potential  usefulness  in  a  number 


APPENDIX  A 


PROOF OF THEOREM 1: CONVERGENCE OF THE MEAN WEIGHT VECTOR
OF THE SOFT-CONSTRAINT LMS ALGORITHM

The theorem statement is:

Theorem 1: Convergence of the Mean Weight Vector

If 1) The soft-constraint LMS algorithm (8-7) or (8-8) produces
a weight vector sequence $W(j)$ from a data vector sequence
$X(j)$ and a desired signal sequence $d(j)$, and if

2) $W(j)$ and $X(j)$ are statistically independent, and if

3) The matrix $R + A^T B A$ is nonsingular, and if

4) $0 < \mu < 1/\lambda_{max}$,

Then the mean weight vector converges to the optimum weight vector:

$$\lim_{j \to \infty} E\{W(j)\} = W_{opt} . \tag{A-1}$$

The  proof  requires  an  expression  for  E{W(j)}.  The  update  algorithm 
for  W(j)  is  the  soft-constraint  LMS  algorithm  (8-8): 

$$W(j+1) = (I - 2\mu A^T B A) W(j) + 2\mu e(j) X(j) + 2\mu A^T B H . \tag{A-2}$$

Expanding  e(j)  using  (3-4)  and  regrouping  terms  yields: 

$$W(j+1) = \{I - 2\mu [X(j) X^T(j) + A^T B A]\} W(j) + 2\mu [d(j) X(j) + A^T B H] . \tag{A-3}$$


Taking  the  expectation  of  this  update  equation,  and  using  the  assumption 
that  X(j)  and  W(j)  are  independent  random  processes  yields: 


$$E\{W(j+1)\} = [I - 2\mu (R + A^T B A)] E\{W(j)\} + 2\mu (P + A^T B H) . \tag{A-4}$$


This  recursion  equation  is  identical  (after  regrouping  of  terms)  to 
the  recursion  equation  which  is  obtained  for  the  weight  vector  when 
perfect  gradient  measurements  are  available  (7-2).  Thus  the  mean  of  the 
weight  vector  follows  the  trajectory  that  is  obtained  when  perfect 
gradient  measurements  are  available. 

Iterating  (A-4)  yields  the  relation: 

$$E\{W(j)\} = [I - 2\mu (R + A^T B A)]^j W(0) + 2\mu \sum_{t=0}^{j-1} [I - 2\mu (R + A^T B A)]^t (P + A^T B H) . \tag{A-5}$$

The summation can be replaced by use of the matrix identity:

$$\sum_{t=0}^{j-1} M^t = (I - M^j)(I - M)^{-1} . \tag{A-6}$$


Using  this  identity  in  (A-5)  results  in: 

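$$\begin{aligned}
E\{W(j)\} &= [I - 2\mu (R + A^T B A)]^j W(0) \\
&\quad + 2\mu \{I - [I - 2\mu (R + A^T B A)]^j\} \{I - [I - 2\mu (R + A^T B A)]\}^{-1} (P + A^T B H) \\
&= [I - 2\mu (R + A^T B A)]^j [W(0) - (R + A^T B A)^{-1} (P + A^T B H)] \\
&\quad + (R + A^T B A)^{-1} (P + A^T B H) .
\end{aligned} \tag{A-7}$$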

Recalling the expression for the optimum weight vector (6-3) and
substituting this relationship in (A-7) yields:

$$E\{W(j)\} = [I - 2\mu (R + A^T B A)]^j [W(0) - W_{opt}] + W_{opt} . \tag{A-8}$$

Then (A-1) will be true for all $W(0)$ if and only if

$$\lim_{j \to \infty} [I - 2\mu (R + A^T B A)]^j = 0 . \tag{A-9}$$

This is true if and only if the magnitude of every eigenvalue of the
matrix $[I - 2\mu (R + A^T B A)]$ is less than one (by the assumption of
nonsingularity, every eigenvalue is nonzero). This condition is written as:

$$|1 - 2\mu \lambda_t| < 1 , \quad t = 1, \ldots, n , \tag{A-10}$$

where $\lambda_t$ is the $t$th eigenvalue of the matrix $R + A^T B A$. This condition
will be satisfied if and only if

$$0 < \mu < \frac{1}{\lambda_t} \quad \text{for all } t, \; t = 1, \ldots, n . \tag{A-11}$$

Since

$$\frac{1}{\lambda_{max}} \le \frac{1}{\lambda_t} \quad \text{for all } t = 1, \ldots, n , \tag{A-12}$$

the required condition is

$$0 < \mu < \frac{1}{\lambda_{max}} . \tag{A-13}$$


When  this  condition  is  satisfied,  (A-9)  is  true,  so  that  under  the  stated 
assumptions  the  conclusion  (A-l)  of  Theorem  1  is  true. 
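
The mean recursion (A-4) and the bound (A-13) can be exercised numerically. The following is a minimal sketch, assuming NumPy; the function name and the matrices are illustrative choices and not taken from the report.

```python
import numpy as np

def mean_weight_trajectory(R, P, A, B, H, mu, W0, steps):
    """Iterate the mean-weight recursion (A-4) and return E{W(steps)}."""
    G = R + A.T @ B @ A   # R + A^T B A
    D = P + A.T @ B @ H   # P + A^T B H
    W = W0.astype(float)
    for _ in range(steps):
        W = W - 2.0 * mu * (G @ W) + 2.0 * mu * D
    return W

# Arbitrary illustrative problem
R = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 0.8]])
P = np.array([1.0, 0.5, -0.3])
A = np.array([[1.0, 1.0, 1.0]])
B = np.array([[1.0]])
H = np.array([1.0])

G = R + A.T @ B @ A
W_opt = np.linalg.solve(G, P + A.T @ B @ H)   # optimum weights, as in (A-7)
lam_max = np.linalg.eigvalsh(G).max()

# mu inside the bound (A-13): the mean converges to W_opt
W_inside = mean_weight_trajectory(R, P, A, B, H, 0.9 / lam_max, np.zeros(3), 2000)

# mu outside the bound: the mode associated with lam_max grows without limit
W_outside = mean_weight_trajectory(R, P, A, B, H, 1.1 / lam_max, np.zeros(3), 200)
```

The second run is included only to show that exceeding $1/\lambda_{max}$ makes $|1 - 2\mu\lambda_{max}| > 1$, so the corresponding mode diverges.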


APPENDIX  B 


PROOF OF THEOREM 2: WEIGHT VECTOR COVARIANCE MATRIX RECURSION

The  statement  of  Theorem  2  is: 

Theorem  2:  Weight  Vector  Covariance 

If 

1)  Theorem  1  holds,  and  if 

2)  W(j)  and  d(j)  are  statistically  independent,  and  if 

3) $d(j)$ and $X(j)$ are gaussianly distributed,

then the recursion equation for the weight vector covariance is

$$\begin{aligned}
C_{WW}(j+1) &= [I - 2\mu (R + A^T B A)] \, C_{WW}(j) \, [I - 2\mu (R + A^T B A)] \\
&\quad + 4\mu^2 \{ R C_{WW}(j) R + R \, \mathrm{Tr}[C_{WW}(j) R] + R \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + [P - R \bar{W}(j)][P - R \bar{W}(j)]^T \} .
\end{aligned} \tag{B-1}$$

The proof begins by recalling the recursion expressions for the
weight vector $W(j)$ and the mean weight vector $\bar{W}(j)$. The expression
for $W(j)$ from (8-7) is:

$$W(j+1) = W(j) + 2\mu e(j) X(j) - 2\mu A^T B [A W(j) - H] . \tag{B-2}$$

The expression for $\bar{W}(j)$ from rearranging (A-4) is:

$$\bar{W}(j+1) = \bar{W}(j) + 2\mu P - 2\mu R \bar{W}(j) - 2\mu A^T B A \bar{W}(j) + 2\mu A^T B H . \tag{B-3}$$

Define the difference between the weight vector and its mean by:

$$\Delta W(j) = W(j) - \bar{W}(j) . \tag{B-4}$$

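Combining (B-2) and (B-3) results in a recursion equation for $\Delta W(j)$:

$$\begin{aligned}
\Delta W(j+1) &= \Delta W(j) + 2\mu [d(j) X(j) - P] \\
&\quad - 2\mu [X(j) X^T(j) W(j) - R \bar{W}(j)] - 2\mu A^T B A \, \Delta W(j) .
\end{aligned} \tag{B-5}$$

Using recursion (B-5) in the definition of a covariance matrix
yields:

$$\begin{aligned}
C_{WW}(j+1) &= E\{[W(j+1) - \bar{W}(j+1)][W(j+1) - \bar{W}(j+1)]^T\} \\
&= E\{\Delta W(j+1) \Delta W^T(j+1)\} \\
&= E\{ \Delta W(j) \Delta W^T(j) + 2\mu \Delta W(j) [d(j) X(j) - P]^T \\
&\quad - 2\mu \Delta W(j) [W^T(j) X(j) X^T(j) - \bar{W}^T(j) R] - 2\mu \Delta W(j) \Delta W^T(j) A^T B A \\
&\quad + 2\mu [d(j) X(j) - P] \Delta W^T(j) + 4\mu^2 [d(j) X(j) - P][d(j) X(j) - P]^T \\
&\quad - 4\mu^2 [d(j) X(j) - P][W^T(j) X(j) X^T(j) - \bar{W}^T(j) R] \\
&\quad - 4\mu^2 [d(j) X(j) - P] \Delta W^T(j) A^T B A \\
&\quad - 2\mu [X(j) X^T(j) W(j) - R \bar{W}(j)] \Delta W^T(j) \\
&\quad - 4\mu^2 [X(j) X^T(j) W(j) - R \bar{W}(j)][d(j) X(j) - P]^T \\
&\quad + 4\mu^2 [X(j) X^T(j) W(j) - R \bar{W}(j)][W^T(j) X(j) X^T(j) - \bar{W}^T(j) R] \\
&\quad + 4\mu^2 [X(j) X^T(j) W(j) - R \bar{W}(j)] \Delta W^T(j) A^T B A \\
&\quad - 2\mu A^T B A \, \Delta W(j) \Delta W^T(j) - 4\mu^2 A^T B A \, \Delta W(j) [d(j) X(j) - P]^T \\
&\quad + 4\mu^2 A^T B A \, \Delta W(j) [W^T(j) X(j) X^T(j) - \bar{W}^T(j) R] \\
&\quad + 4\mu^2 A^T B A \, \Delta W(j) \Delta W^T(j) A^T B A \} .
\end{aligned} \tag{B-6}$$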


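Now using the assumptions of independence, and noting that $E\{\Delta W(j)\} = 0$,
terms 2, 5, 8, and 14 become zero. Regrouping terms then yields:

$$\begin{aligned}
C_{WW}(j+1) &= [I - 2\mu (R + A^T B A)] \, C_{WW}(j) \, [I - 2\mu (R + A^T B A)] - 4\mu^2 R C_{WW}(j) R \\
&\quad + E\{ 4\mu^2 [d^2(j) X(j) X^T(j) - P P^T] \\
&\quad - 4\mu^2 [d(j) X(j) - P][W^T(j) X(j) X^T(j) - \bar{W}^T(j) R] \\
&\quad - 4\mu^2 [X(j) X^T(j) W(j) - R \bar{W}(j)][d(j) X(j) - P]^T \\
&\quad + 4\mu^2 [X(j) X^T(j) W(j) W^T(j) X(j) X^T(j) - R \bar{W}(j) \bar{W}^T(j) R] \} .
\end{aligned} \tag{B-7}$$

Now it can be shown that for $W(j)$, $d(j)$, and $X(j)$ assumed gaussian:

$$E\{d^2(j) X(j) X^T(j)\} = E\{d^2(j)\} R + 2 P P^T , \tag{B-8}$$

$$E\{d(j) X(j) W^T(j) X(j) X^T(j)\} = P \bar{W}^T(j) R + R \bar{W}(j) P^T + R \, P^T \bar{W}(j) , \tag{B-9}$$

$$\begin{aligned}
E\{X(j) X^T(j) W(j) W^T(j) X(j) X^T(j)\} &= 2 R C_{WW}(j) R + 2 R \bar{W}(j) \bar{W}^T(j) R \\
&\quad + R \, \mathrm{Tr}[C_{WW}(j) R] + R \, \mathrm{Tr}[\bar{W}(j) \bar{W}^T(j) R] .
\end{aligned} \tag{B-10}$$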

These relations are shown by expanding each element of the matrices on
the left hand sides individually by using the summations implied by the
matrix notation on the right hand sides, applying the expression for the
expectation of 4 jointly distributed gaussian random variables
(eq. 7.2-15 of [23]), and reconstructing the matrices.

Now, applying relations (B-8) through (B-10) to equation (B-7),
cancelling terms, and regrouping yields:

$$\begin{aligned}
C_{WW}(j+1) &= [I - 2\mu (R + A^T B A)] \, C_{WW}(j) \, [I - 2\mu (R + A^T B A)] \\
&\quad + 4\mu^2 \{ R C_{WW}(j) R + R \, \mathrm{Tr}[C_{WW}(j) R] \\
&\quad + R \, [E\{d^2(j)\} - 2 P^T \bar{W}(j) + \bar{W}^T(j) R \bar{W}(j)] \\
&\quad + [P P^T - R \bar{W}(j) P^T - P \bar{W}^T(j) R + R \bar{W}(j) \bar{W}^T(j) R] \} .
\end{aligned} \tag{B-11}$$

Now, the mean square error evaluated at the mean weight vector is
$E\{d^2(j)\} - 2 P^T \bar{W}(j) + \bar{W}^T(j) R \bar{W}(j)$. Substituting this relation yields the
theorem's conclusion:

$$\begin{aligned}
C_{WW}(j+1) &= [I - 2\mu (R + A^T B A)] \, C_{WW}(j) \, [I - 2\mu (R + A^T B A)] \\
&\quad + 4\mu^2 \{ R C_{WW}(j) R + R \, \mathrm{Tr}[C_{WW}(j) R] + R \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + [P - R \bar{W}(j)][P - R \bar{W}(j)]^T \} .
\end{aligned} \tag{B-12}$$

This concludes the proof of the theorem.

APPENDIX C

PROOF OF THEOREM 3: SUFFICIENT CONDITIONS FOR BOUNDEDNESS
OF THE TRACE OF THE WEIGHT COVARIANCE MATRIX

The statement of Theorem 3 is:

Theorem 3: Sufficient conditions for boundedness of the trace of
the weight vector covariance matrix

1) If Theorem 2 holds, and

2) if a) $\gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R] < \lambda_{max} \lambda_{min}$ and

$$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} ,$$

or if b) $\gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R] \ge \lambda_{max} \lambda_{min}$ and

$$0 < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} ,$$

Then $\mathrm{Tr}[C_{WW}(j)]$ will be bounded for all time.

To determine conditions under which the weight vector covariance
matrix is guaranteed to remain finite, consider the recursion of the
trace of the weight covariance matrix (Eq. 9-7).

First, for any matrices $E$, $F$, and $G$:

$$\mathrm{Tr}[EFG] = \mathrm{Tr}[GEF] . \tag{C-1}$$

Applying this relation to (9-7) yields:

$$\begin{aligned}
\mathrm{Tr}[C_{WW}(j+1)] &= \mathrm{Tr}\{[I - 2\mu (R + A^T B A)][I - 2\mu (R + A^T B A)] \, C_{WW}(j)\} \\
&\quad + 4\mu^2 \{ \mathrm{Tr}[R R C_{WW}(j)] + \mathrm{Tr}[R] \, \mathrm{Tr}[R C_{WW}(j)] \\
&\quad + \mathrm{Tr}[R] \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + [P - R \bar{W}(j)]^T [P - R \bar{W}(j)] \} .
\end{aligned} \tag{C-2}$$

Now Moschner showed (relation 2.10 in [24]) that for $F$ a real
symmetric matrix and $G$ a positive semidefinite matrix that:

$$\mathrm{Tr}[FG] \le \max\{\mathrm{eig}\{F\}\} \, \mathrm{Tr}[G] , \tag{C-3}$$

$$\mathrm{Tr}[FG] \ge \min\{\mathrm{eig}\{F\}\} \, \mathrm{Tr}[G] . \tag{C-4}$$

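By repeated application of (C-3) on (C-2) the trace of $C_{WW}(j+1)$ may be
bounded:

$$\begin{aligned}
\mathrm{Tr}[C_{WW}(j+1)] &\le \beta_{max}^2 \mathrm{Tr}[C_{WW}(j)] \\
&\quad + 4\mu^2 \{ \gamma_{max}^2 \mathrm{Tr}[C_{WW}(j)] + \gamma_{max} \mathrm{Tr}[R] \, \mathrm{Tr}[C_{WW}(j)] \\
&\quad + \mathrm{Tr}[R] \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + [P - R \bar{W}(j)]^T [P - R \bar{W}(j)] \} ,
\end{aligned} \tag{C-5}$$

that is,

$$\begin{aligned}
\mathrm{Tr}[C_{WW}(j+1)] &\le (\beta_{max}^2 + 4\mu^2 \gamma_{max}^2 + 4\mu^2 \gamma_{max} \mathrm{Tr}[R]) \, \mathrm{Tr}[C_{WW}(j)] \\
&\quad + 4\mu^2 \{ \mathrm{Tr}[R] \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + [P - R \bar{W}(j)]^T [P - R \bar{W}(j)] \} ,
\end{aligned} \tag{C-6}$$

where

$$\beta_{max} = \max\{|\mathrm{eig}\{I - 2\mu (R + A^T B A)\}|\} , \tag{C-7}$$

$$\gamma_{max} = \max\{\mathrm{eig}\{R\}\} . \tag{C-8}$$

This inequality fits the form:

$$\mathrm{Tr}[C_{WW}(j+1)] \le a \, \mathrm{Tr}[C_{WW}(j)] + c(j) , \tag{C-9}$$

where $a$ is a positive constant, and $c(j)$ is bounded in value. From
linear system theory it is known that $\mathrm{Tr}[C_{WW}(j)]$ will remain bounded
when $a < 1$. Therefore if

$$\beta_{max}^2 + 4\mu^2 \gamma_{max}^2 + 4\mu^2 \gamma_{max} \mathrm{Tr}[R] < 1 \tag{C-10}$$

then $\mathrm{Tr}[C_{WW}(j)]$ will remain bounded.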

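To evaluate the inequality of (C-10) requires knowledge of $\beta_{max}$.
$\beta_{max}$ is the absolute value of the eigenvalue of $[I - 2\mu (R + A^T B A)]$ which has
the greatest magnitude. Now, the eigenvalues of $[I - 2\mu (R + A^T B A)]$ are
$1 - 2\mu \lambda_i$, where the $\lambda_i$ are the eigenvalues of $R + A^T B A$. The maximum
eigenvalue is $1 - 2\mu \lambda_{min}$, and the minimum eigenvalue is $1 - 2\mu \lambda_{max}$, because $\lambda_i > 0$.
Thus $\beta_{max}$ is the absolute value of one of these expressions:

$$\beta_{max} = \max\{ |1 - 2\mu \lambda_{min}| , \; |1 - 2\mu \lambda_{max}| \} . \tag{C-11}$$

Now, since $\lambda_{max} \ge \lambda_{min}$, $1 - 2\mu \lambda_{max} \le 1 - 2\mu \lambda_{min}$. The only way that
$|1 - 2\mu \lambda_{max}|$ might be greater than $|1 - 2\mu \lambda_{min}|$ is when $1 - 2\mu \lambda_{max}$ is negative
but has a greater magnitude than $1 - 2\mu \lambda_{min}$. Thus the requirement is
$1 - 2\mu \lambda_{min} \ge 2\mu \lambda_{max} - 1$ for $\beta_{max} = 1 - 2\mu \lambda_{min}$. Solving this inequality for $\mu$
finally yields the result that

$$\beta_{max} = \begin{cases} 1 - 2\mu \lambda_{min} , & \text{if } \mu \le \dfrac{1}{\lambda_{max} + \lambda_{min}} , \\[2ex] 2\mu \lambda_{max} - 1 , & \text{if } \mu > \dfrac{1}{\lambda_{max} + \lambda_{min}} . \end{cases} \tag{C-12}$$

These expressions for $\beta_{max}$ are now used to determine where (C-10)
holds true.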

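Case 1: when $\mu \le \dfrac{1}{\lambda_{max} + \lambda_{min}}$, substitution for $\beta_{max}$ in (C-10)
results in:

$$(1 - 2\mu \lambda_{min})^2 + 4\mu^2 \gamma_{max}^2 + 4\mu^2 \gamma_{max} \mathrm{Tr}[R] < 1 \tag{C-13}$$

which can be solved to yield the additional condition on $\mu$:

$$\mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} . \tag{C-14}$$

Thus if

$$0 < \mu < \min\left\{ \frac{1}{\lambda_{max} + \lambda_{min}} , \; \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} \right\} \tag{C-15}$$

then $a < 1$, and $\mathrm{Tr}[C_{WW}(j)]$ is guaranteed to remain bounded.

Case 2: when $\mu > \dfrac{1}{\lambda_{max} + \lambda_{min}}$, substituting $\beta_{max}$ into (C-10) results
in:

$$(2\mu \lambda_{max} - 1)^2 + 4\mu^2 \gamma_{max}^2 + 4\mu^2 \gamma_{max} \mathrm{Tr}[R] < 1 \tag{C-16}$$

which yields the condition on $\mu$ of:

$$\mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} . \tag{C-17}$$

Thus if

$$\frac{1}{\lambda_{max} + \lambda_{min}} < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} \tag{C-18}$$

then $a < 1$, and $\mathrm{Tr}[C_{WW}(j)]$ will remain bounded.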

The conditions on $\mu$ derived in the two cases may be manipulated to
yield a more acceptable form. Note that if:

$$\gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R] < \lambda_{max} \lambda_{min} \tag{C-19}$$

then the condition of the first case is satisfied when $\mu$ is in the range

$$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} \tag{C-20}$$

and the conditions for the second case are never satisfied. This proves
the first half of the theorem's conclusion.


Now consider the situation when

$$\gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R] \ge \lambda_{max} \lambda_{min} . \tag{C-21}$$

Then the conditions of the first case are satisfied when

$$0 < \mu \le \frac{1}{\lambda_{max} + \lambda_{min}} \tag{C-22}$$

and the conditions of the second case are satisfied when

$$\frac{1}{\lambda_{max} + \lambda_{min}} < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} . \tag{C-23}$$

These two ranges for $\mu$ may be combined to yield the single range of

$$0 < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} . \tag{C-24}$$

This is the proof of the second half of the theorem's conclusion.
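
The quantity $a$ that multiplies $\mathrm{Tr}[C_{WW}(j)]$ in (C-6) can be evaluated directly for a candidate $\mu$. The following is a minimal sketch, assuming NumPy and real symmetric $R$ and $A^T B A$; the function names and the test matrices are illustrative assumptions. It computes $a$ from (C-7), (C-8) and (C-10), and the Case 1 step-size limit from (C-15).

```python
import numpy as np

def contraction_factor(mu, R, AtBA):
    """a = beta_max^2 + 4 mu^2 gamma_max^2 + 4 mu^2 gamma_max Tr[R], as in (C-10)."""
    lam = np.linalg.eigvalsh(R + AtBA)             # eigenvalues of R + A^T B A
    gamma_max = np.linalg.eigvalsh(R).max()        # (C-8)
    beta_max = np.abs(1.0 - 2.0 * mu * lam).max()  # (C-7)
    return beta_max**2 + 4.0 * mu**2 * gamma_max**2 + 4.0 * mu**2 * gamma_max * np.trace(R)

def case1_step_limit(R, AtBA):
    """Upper limit on mu from (C-15)."""
    lam = np.linalg.eigvalsh(R + AtBA)
    lam_min, lam_max = lam.min(), lam.max()
    gamma_max = np.linalg.eigvalsh(R).max()
    g = gamma_max**2 + gamma_max * np.trace(R)
    return min(1.0 / (lam_max + lam_min), lam_min / (lam_min**2 + g))

# Arbitrary illustrative matrices
R = np.array([[2.0, 0.5, 0.0],
              [0.5, 1.0, 0.2],
              [0.0, 0.2, 0.8]])
AtBA = np.ones((3, 3))   # A = [1 1 1], B = [1]

mu_limit = case1_step_limit(R, AtBA)
a_inside = contraction_factor(0.99 * mu_limit, R, AtBA)   # below the (C-15) limit
a_large = contraction_factor(1.0, R, AtBA)                # a deliberately large step
```

Evaluating $a$ directly in this way avoids having to decide in advance which of the two cases applies.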
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PROOF  OF  THEOREM  4:  NECESSARY  CONDITIONS  FOR  BOUNDEDNESS  OF  THE  TRACE 
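OF THE WEIGHT VECTOR COVARIANCE MATRIX

The statement of Theorem 4 is:

Theorem 4: Necessary conditions for boundedness of the trace of the
weight vector covariance matrix

For $\mathrm{Tr}[C_{WW}(j)]$ to be a bounded sequence, it is necessary that

1) Theorem 2 hold, and also that

2a) When $\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] \ge \lambda_{max}^2$,

that $0 < \mu < \dfrac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]}$ ;

b) When $\lambda_{min}^2 \le \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] \le \lambda_{max}^2$,

that $0 < \mu < \dfrac{1}{2\sqrt{\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]}}$ ;

c) When $\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] < \lambda_{min}^2$,

that $0 < \mu < \dfrac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]}$ .

To find a necessary condition for $\mathrm{Tr}[C_{WW}(j)]$ to remain bounded,
consider finding a sequence $\delta(j)$ which is a lower bound for the sequence
$\mathrm{Tr}[C_{WW}(j)]$. Then the sequence $\delta(j)$ must remain bounded for $\mathrm{Tr}[C_{WW}(j)]$
to be bounded, since if $\delta(j)$ grows without bound, and $\delta(j) \le \mathrm{Tr}[C_{WW}(j)]$,
then $\mathrm{Tr}[C_{WW}(j)]$ will also grow without bound. Applying (C-1) and then
(C-4) to Eq. (9-7) in a similar manner as in Appendix C yields:

$$\begin{aligned}
\mathrm{Tr}[C_{WW}(j+1)] &\ge \beta_{min}^2 \mathrm{Tr}[C_{WW}(j)] + 4\mu^2 \gamma_{min}^2 \mathrm{Tr}[C_{WW}(j)] + 4\mu^2 \gamma_{min} \mathrm{Tr}[R] \, \mathrm{Tr}[C_{WW}(j)] \\
&\quad + 4\mu^2 \{ \mathrm{Tr}[R] \, E\{e^2(j)|_{W = \bar{W}(j)}\} + [P - R \bar{W}(j)]^T [P - R \bar{W}(j)] \} ,
\end{aligned} \tag{D-1}$$

where

$$\beta_{min} = \min\{|\mathrm{eig}\{I - 2\mu (R + A^T B A)\}|\} , \tag{D-2}$$

$$\gamma_{min} = \min\{\mathrm{eig}\{R\}\} . \tag{D-3}$$

Define the recursion for $\delta(j)$ as

$$\begin{aligned}
\delta(j+1) &= (\beta_{min}^2 + 4\mu^2 \gamma_{min}^2 + 4\mu^2 \gamma_{min} \mathrm{Tr}[R]) \, \delta(j) \\
&\quad + 4\mu^2 [ \mathrm{Tr}[R] \, E\{e^2(j)|_{W = \bar{W}(j)}\} \\
&\quad + (P - R \bar{W}(j))^T (P - R \bar{W}(j)) ]
\end{aligned} \tag{D-4}$$

and

$$\delta(0) = \mathrm{Tr}[C_{WW}(0)] . \tag{D-5}$$

The equation for $\delta(j)$ is of the form:

$$\delta(j+1) = a \, \delta(j) + c(j) \tag{D-6}$$

where $a$ is a positive constant, and $c(j)$ is a positive bounded sequence.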

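Again from linear system theory, $\delta(j)$ will be bounded if and only if
$a < 1$. Evaluation of $a$ requires knowledge of $\beta_{min}$. Recall that
$1 - 2\mu \lambda_{min} \ge 1 - 2\mu \lambda_{max}$. Therefore, if $1 - 2\mu \lambda_{max} > 0$, $\beta_{min} = 1 - 2\mu \lambda_{max}$. However,
if $1 - 2\mu \lambda_{min} < 0$, then $\beta_{min} = 2\mu \lambda_{min} - 1$. Otherwise, it is possible that
for some $i$, $1 - 2\mu \lambda_i = 0$, so $\beta_{min}$ is taken to be 0 in
this case. Thus

$$\beta_{min} = \begin{cases} 1 - 2\mu \lambda_{max} , & \text{if } \mu < \dfrac{1}{2\lambda_{max}} , \\[2ex] 2\mu \lambda_{min} - 1 , & \text{if } \mu > \dfrac{1}{2\lambda_{min}} , \\[2ex] 0 , & \text{otherwise.} \end{cases} \tag{D-7}$$

Case 1: Consider the value of $a$ when $\mu < \dfrac{1}{2\lambda_{max}}$. Then $\beta_{min} = 1 - 2\mu \lambda_{max}$,
resulting in

$$a = (1 - 2\mu \lambda_{max})^2 + 4\mu^2 \gamma_{min}^2 + 4\mu^2 \gamma_{min} \mathrm{Tr}[R] < 1 . \tag{D-8}$$

Solving for $\mu$ yields

$$\mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]} . \tag{D-9}$$

Combining the two conditions on $\mu$ yields the result that if

$$a = (2\mu \lambda_{min} - 1)^2 + 4\mu^2 \gamma_{min}^2 + 4\mu^2 \gamma_{min} \mathrm{Tr}[R] < 1 . \tag{D-14}$$

Solving for $\mu$ yields

$$\mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]} . \tag{D-15}$$

Thus, if

$$\frac{1}{2\lambda_{min}} < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]} \tag{D-16}$$

then $a < 1$. Note that if

$$\lambda_{min}^2 \le \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] \tag{D-17}$$

then $a \ge 1$ whenever $\mu > \dfrac{1}{2\lambda_{min}}$.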

Grouping these results for the three cases of $\mu$ together in a
manner similar to that used in the proof of Theorem 3 yields the result:

$a < 1$, if:

1) When $\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] \ge \lambda_{max}^2$,

$$0 < \mu < \frac{\lambda_{max}}{\lambda_{max}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]} ;$$

2) When $\lambda_{min}^2 \le \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] \le \lambda_{max}^2$,

$$0 < \mu < \frac{1}{2\sqrt{\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]}} ;$$

3) When $\gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R] < \lambda_{min}^2$,

$$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{min}^2 + \gamma_{min} \mathrm{Tr}[R]} .$$

If none of these conditions are satisfied, then $a \ge 1$, and $\delta(j)$ will be
an unbounded sequence, which implies that $\mathrm{Tr}[C_{WW}(j)]$ is an unbounded
sequence. Thus satisfaction of the above conditions is a necessary
(but not sufficient) condition for $\mathrm{Tr}[C_{WW}(j)]$ to be a bounded sequence.

APPENDIX E

DERIVATION OF A CALCULABLE BOUND ON $\mu$ WHICH SATISFIES THE CONDITIONS
FOR CONVERGENCE OF THE MEAN WEIGHT VECTOR AND GUARANTEED BOUNDEDNESS
OF THE WEIGHT VECTOR COVARIANCE MATRIX

Section 9 proposes a bound on $\mu$ which is calculable and satisfies
the conditions on $\mu$ in Theorems 1 and 3. This appendix demonstrates
that the bound satisfies these criteria.

The bound (9-8) proposed is:

$$0 < \mu < \frac{1}{3 \mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} . \tag{E-1}$$

Satisfying Theorem 1.

This bound satisfies the conditions for convergence of the mean
weight vector (Theorem 1) since:

$$\frac{1}{\lambda_{max}} \ge \frac{1}{\mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} > \frac{1}{3 \mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} . \tag{E-2}$$

Thus if

$$0 < \mu < \frac{1}{3 \mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} \tag{E-3}$$

then

$$0 < \mu < \frac{1}{\lambda_{max}} , \tag{E-4}$$

satisfying the condition on $\mu$ of Theorem 1.


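Satisfying Theorem 3.

This bound (E-1) also satisfies the conditions on $\mu$ which guarantee
boundedness of the weight vector covariance matrix (Theorem 3):

Case 1 of Theorem 3: using the condition of case 1, the following
inequality may be written:

$$\frac{\lambda_{min}}{\lambda_{min}^2 + \lambda_{max} \lambda_{min}} \le \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} .$$

Then

$$\frac{1}{\lambda_{min} + \lambda_{max}} \le \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} ,$$

$$\frac{1}{\mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} \le \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} ,$$

$$\frac{1}{3 \mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]} \le \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} .$$

Thus if

$$0 < \mu < \frac{1}{3 \mathrm{Tr}[R] + \mathrm{Tr}[A^T B A]}$$

then

$$0 < \mu < \frac{\lambda_{min}}{\lambda_{min}^2 + \gamma_{max}^2 + \gamma_{max} \mathrm{Tr}[R]} ,$$

satisfying the condition on $\mu$ of Case 1 of Theorem 3.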


Case  2  of  Theorem  3:  Begin  with  the  equality: 


1 


,  ,  .  Ymax  ,  Ymaxx 

W  *  Ymax  >—  *  T— Tr[R3 
max  max 


max 


^max  +  Ymax  +  Ymax"^-^ 


Now  51  Ym , .. »  so 

ffldX  ~  msx 


1 


max 


?  7 

X  +  v  +  Tr[R]  X  +  Y  +  v  TrfRj 
max  max  L— 1  max  ?max  ’max  L-J 


Tr[R+A  BA]  +  Tr[R]  +  Tr[R] 


max 


Tr[ATM]  ♦  3Tr[R]  A^x  ♦  Y^ax  +  WTr[R] 


Thus  if 


0  <  y  < 


3Tr[R]  +  Tr[ATBA] 


(E-7) 


Then 


0  <  y  < 


max 


X^  +  y^  +  y  Tr[R] 
max  'max  max  L-J 


(E-8) 


satisfying  the  condition  on  y  of  case  2  of  Theorem  3. 

Since both cases of Theorem 3 are satisfied, the proposed bound
(E-1) on $\mu$ will guarantee boundedness of the weight vector
covariance matrix.

$\mathrm{Tr}[A^{T}BA]$ is certainly calculable, since the matrices are
all specified by the designer. $\mathrm{Tr}[R]$ is also computable, as
discussed in Section IX; it is $n$ times the input power to the filter.
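
As a rough illustration of how the bound (E-1) might be evaluated, the following Python sketch computes $1/(3\,\mathrm{Tr}[R] + \mathrm{Tr}[A^{T}BA])$ for a small real-valued case and compares it with $1/\lambda_{\max}$. The matrices, the helper names (`transpose`, `matmul`, `trace`, `step_size_bound`, `lambda_max_sym2`), and the reading of $\lambda_{\max}$ as the largest eigenvalue of $R + A^{T}BA$ are assumptions made for this example, not values or definitions taken from the report.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def trace(m):
    """Sum of the diagonal elements of a square matrix."""
    return sum(m[i][i] for i in range(len(m)))

def step_size_bound(R, A, B):
    """Upper limit 1 / (3 Tr[R] + Tr[A^T B A]) on the step size mu."""
    atba = matmul(matmul(transpose(A), B), A)
    return 1.0 / (3.0 * trace(R) + trace(atba))

def lambda_max_sym2(m):
    """Largest eigenvalue of a real symmetric 2x2 matrix (closed form)."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    return 0.5 * (a + d) + math.sqrt((0.5 * (a - d)) ** 2 + b * b)

# Hypothetical example: two weights, one constraint.
R = [[2.0, 0.5], [0.5, 1.0]]   # assumed input autocorrelation matrix
A = [[1.0, 1.0]]               # assumed constraint matrix (1 x 2)
B = [[4.0]]                    # assumed constraint weighting (1 x 1)

mu_bound = step_size_bound(R, A, B)

# Assumption: lambda_max is the largest eigenvalue of R + A^T B A.
atba = matmul(matmul(transpose(A), B), A)
total = [[R[i][j] + atba[i][j] for j in range(2)] for i in range(2)]
lam_max = lambda_max_sym2(total)

print("bound on mu  :", mu_bound)
print("1/lambda_max :", 1.0 / lam_max)
```

Only the two traces are needed to evaluate the bound; the eigenvalue is computed here solely to check the example against the Theorem 1 condition.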
APPENDIX F


SIMULTANEOUS  DIAGONALIZATION  OF  TWO  HERMITIAN  MATRICES 


This appendix shows that a transformation exists which simultaneously
diagonalizes two hermitian matrices. This is a previously solved problem,
included here for completeness [25,26]. This appendix follows the
development of [25]; Gantmacher [26] arrives at the same results by a
different path.


The theorem is stated as follows: Given two hermitian matrices $A$
and $B$ with $B$ nonsingular, a matrix $S$ exists such that
$S^{+}BS = I$, the identity matrix (where $S^{+}$ denotes the complex
conjugate transpose of the matrix $S$), and $S^{+}AS = \Psi$, where
$\Psi$ is a diagonal matrix.

The proof begins by noting that for any hermitian matrix there exists
a unitary transformation which will diagonalize that hermitian matrix.
Thus, for the hermitian matrix $B$ there exists a unitary matrix $P$
such that $P^{+}BP = \Psi_B$, where $\Psi_B$ is the diagonal matrix
whose diagonal elements are the eigenvalues of the matrix $B$.

Denote the diagonal matrix which has as elements the square roots of
the eigenvalues of $B$ as $\Psi_B^{1/2}$, yielding the relation
$(\Psi_B^{1/2})^{+}\Psi_B^{1/2} = \Psi_B$.

Under the assumption that $\Psi_B$ (and thus $B$) is nonsingular,
$\Psi_B^{1/2}$ is also nonsingular. Using the inverse of
$\Psi_B^{1/2}$ results in the relation:

$$[(\Psi_B^{1/2})^{+}]^{-1}P^{+}BP(\Psi_B^{1/2})^{-1} = I . \tag{F-1}$$

Thus the matrix $P(\Psi_B^{1/2})^{-1}$ transforms $B$ to an identity
matrix.
Now consider applying this transformation to the hermitian matrix $A$.
This would result in the matrix:

$$[(\Psi_B^{1/2})^{+}]^{-1}P^{+}AP(\Psi_B^{1/2})^{-1} . \tag{F-2}$$

This matrix is also hermitian. Therefore, there exists a unitary matrix
$Q$ such that the resulting unitary transformation diagonalizes the
matrix of expression (F-2), yielding:

$$Q^{+}[(\Psi_B^{1/2})^{+}]^{-1}P^{+}AP(\Psi_B^{1/2})^{-1}Q = \Psi , \tag{F-3}$$

where $\Psi$ is a diagonal matrix. Note that $P(\Psi_B^{1/2})^{-1}Q$
is a transformation which will diagonalize the matrix $A$. Denote this
transformation by $S$:

$$S = P(\Psi_B^{1/2})^{-1}Q . \tag{F-4}$$

Now apply this transformation to the matrix $B$. Recall that $P$ was
originally chosen so that $P^{+}BP = \Psi_B$. Then

$$S^{+}BS = Q^{+}[(\Psi_B^{1/2})^{+}]^{-1}P^{+}BP(\Psi_B^{1/2})^{-1}Q . \tag{F-5}$$

By application of (F-1),

$$S^{+}BS = Q^{+}Q = I , \tag{F-6}$$

since $Q$ is a unitary transformation. Therefore the matrix $S$
satisfies the two relationships

$$S^{+}BS = I , \tag{F-7}$$

$$S^{+}AS = \Psi , \tag{F-8}$$

where $\Psi$ is a diagonal matrix and $S = P(\Psi_B^{1/2})^{-1}Q$,
where $P$ is a unitary matrix which diagonalizes $B$, $\Psi_B^{1/2}$
is a diagonal matrix composed of the square roots of the eigenvalues
of $B$, and $Q$ is a unitary matrix which diagonalizes
$[(\Psi_B^{1/2})^{+}]^{-1}P^{+}AP(\Psi_B^{1/2})^{-1}$.
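
The construction above can be traced numerically. The following Python sketch builds a simultaneously diagonalizing transformation in three steps (diagonalize the second matrix, rescale by the inverse square roots of its eigenvalues, then diagonalize the transformed first matrix) and checks both resulting relations. It is an illustration under restrictive assumptions that are not part of the theorem statement: both matrices are real symmetric 2x2, so the conjugate transpose reduces to the ordinary transpose, and the second matrix is positive definite, so the square roots are real. The helper names and the test matrices are hypothetical.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(x, y):
    """Plain matrix product of two list-of-rows matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def eig_sym2(m):
    """Eigenvalues and orthonormal eigenvectors (as columns) of a real
    symmetric 2x2 matrix, obtained from a single Jacobi rotation."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    vals = [a * c * c + 2.0 * b * c * s + d * s * s,
            a * s * s - 2.0 * b * c * s + d * c * c]
    vecs = [[c, -s], [s, c]]
    return vals, vecs

def simultaneous_diagonalize(A, B):
    """Return S with S^T B S = I and S^T A S diagonal.
    Assumes A and B are real symmetric 2x2 and B is positive definite."""
    b_vals, P = eig_sym2(B)                       # P^T B P = diag(b_vals)
    inv_sqrt = [[1.0 / math.sqrt(b_vals[0]), 0.0],
                [0.0, 1.0 / math.sqrt(b_vals[1])]]
    T = matmul(P, inv_sqrt)                       # T^T B T = I
    A_t = matmul(matmul(transpose(T), A), T)      # transformed A, still symmetric
    _, Q = eig_sym2(A_t)                          # Q^T A_t Q is diagonal
    return matmul(T, Q)                           # S = P * inv_sqrt * Q

# Hypothetical test matrices.
A = [[1.0, 2.0], [2.0, -1.0]]
B = [[2.0, 1.0], [1.0, 2.0]]

S = simultaneous_diagonalize(A, B)
SBS = matmul(matmul(transpose(S), B), S)   # expected: identity
SAS = matmul(matmul(transpose(S), A), S)   # expected: diagonal
print("S^T B S =", SBS)
print("S^T A S =", SAS)
```

For larger or complex hermitian matrices a library eigenvalue routine would replace `eig_sym2`; the three-step structure stays the same.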

Although deeper knowledge of $S$ and $\Psi$ is not required, the
following properties can be shown:

1) The columns of $S$ are the eigenvectors of the matrix $B^{-1}A$.

2) The values of the diagonal elements of $\Psi$ are the eigenvalues
of the matrix $B^{-1}A$.

These properties can be shown as follows: Begin with the relation:

$$S^{+}AS = \Psi . \tag{F-9}$$

Then

$$AS = (S^{+})^{-1}\Psi . \tag{F-10}$$

Now, premultiply by $B^{-1}$:

$$B^{-1}AS = B^{-1}(S^{+})^{-1}\Psi . \tag{F-11}$$

An expression for $B^{-1}(S^{+})^{-1}$ can be found by considering the
relation:

$$S^{+}BS = I . \tag{F-12}$$

Inverting both sides yields:

$$(S^{+}BS)^{-1} = S^{-1}B^{-1}(S^{+})^{-1} = I . \tag{F-13}$$

Then premultiplying by $S$ yields:

$$B^{-1}(S^{+})^{-1} = S . \tag{F-14}$$

Substituting this relation in (F-11) yields:

$$B^{-1}AS = S\Psi , \tag{F-15}$$

which shows that the columns of $S$ are the eigenvectors of $B^{-1}A$
and the diagonal elements of $\Psi$ are the corresponding eigenvalues.
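
A small hand-checkable case may make the eigenvector property concrete. The matrices below are chosen for this illustration only and do not come from the report; the second matrix is taken positive definite so that all square roots are real, and since everything is real the conjugate transpose is the ordinary transpose.

```latex
% Illustrative 2x2 example (assumed values).
A = \begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}, \qquad
B = \begin{bmatrix} 4 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
S = \frac{1}{\sqrt{2}} \begin{bmatrix} 1/2 & 1/2 \\ 1 & -1 \end{bmatrix}

% Both diagonalization relations hold:
S^{T} B S = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \qquad
S^{T} A S = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}

% The columns of S are eigenvectors of B^{-1}A with eigenvalues 1 and -1:
B^{-1}A = \begin{bmatrix} 0 & 1/2 \\ 2 & 0 \end{bmatrix}, \qquad
B^{-1}A\,S = \frac{1}{\sqrt{2}} \begin{bmatrix} 1/2 & -1/2 \\ 1 & 1 \end{bmatrix}
           = S \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
```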


APPENDIX  G 


PROOF OF THEOREM 5: OPTIMUM WEIGHT VECTOR FOR SOFT-CONSTRAINT LMS
ALGORITHM GOES TO OPTIMUM WEIGHT VECTOR FOR FROST'S HARD CONSTRAINT
LMS ALGORITHM

The theorem statement is:

Theorem 5: Soft Constraint Solution Goes to Hard Constraint Solution

If

1) The weighting matrix in the soft constraint algorithm is
replaced by $\gamma B$, so that the weighting on the constraints may be
increased simultaneously by increasing the scalar $\gamma$, and if

2) The data vector autocorrelation matrix $R$ is nonsingular, and if

3) The weighting matrix $B$ is nonsingular, and the matrix $A$ is full
rank, then

$$\lim_{\gamma \to \infty} W_{\mathrm{opt}} = W_{hc} . \tag{G-1}$$


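Proof: The proof begins by writing $W_{\mathrm{opt}}$ in terms of
$\gamma B$ from (6-3):

$$W_{\mathrm{opt}} = (R + \gamma A^{T}BA)^{-1}(P + \gamma A^{T}BH) .$$

Now apply the matrix inversion lemma (section 5.7 of [25]) to
$R + \gamma A^{T}BA$:

$$\begin{aligned}
W_{\mathrm{opt}} ={}& \left[R^{-1} - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}AR^{-1}\right]P \\
&+ \left[R^{-1} - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}AR^{-1}\right]\gamma A^{T}BH .
\end{aligned}$$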

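The conditions of the lemma are guaranteed to be satisfied in this
use and in the next, due to the assumptions made on $R$, $B$, and on
$A$. This assumption is also required by Frost to yield a unique
optimum weight vector for his algorithm. Now apply the matrix
inversion lemma to $\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}$ in the
second term:

$$\begin{aligned}
W_{\mathrm{opt}} ={}& \left[R^{-1} - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}AR^{-1}\right]P \\
&+ \Big\{R^{-1} - R^{-1}A^{T}\Big[(AR^{-1}A^{T})^{-1} \\
&\quad - (AR^{-1}A^{T})^{-1}\left[\gamma B + (AR^{-1}A^{T})^{-1}\right]^{-1}(AR^{-1}A^{T})^{-1}\Big]AR^{-1}\Big\}\gamma A^{T}BH .
\end{aligned} \tag{G-3}$$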


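Now factoring out $R^{-1}$ as a postmultiplier in the first term and
as a premultiplier in the second term, and multiplying the $A^{T}$
factor of $\gamma A^{T}BH$ in the second term through yields:

$$\begin{aligned}
W_{\mathrm{opt}} ={}& \left[I - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}A\right]R^{-1}P \\
&+ R^{-1}\Big\{A^{T} - A^{T}\Big[(AR^{-1}A^{T})^{-1} \\
&\quad - (AR^{-1}A^{T})^{-1}\left[\gamma B + (AR^{-1}A^{T})^{-1}\right]^{-1}(AR^{-1}A^{T})^{-1}\Big]AR^{-1}A^{T}\Big\}\gamma BH .
\end{aligned} \tag{G-4}$$

Since $(AR^{-1}A^{T})^{-1}(AR^{-1}A^{T}) = I$, the second term reduces,
yielding:

$$\begin{aligned}
W_{\mathrm{opt}} ={}& \left[I - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}A\right]R^{-1}P \\
&+ R^{-1}\Big\{A^{T} - A^{T}\Big[I - (AR^{-1}A^{T})^{-1}\left[\gamma B + (AR^{-1}A^{T})^{-1}\right]^{-1}\Big]\Big\}\gamma BH
\end{aligned}$$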
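$$\begin{aligned}
={}& \left[I - R^{-1}A^{T}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1}A\right]R^{-1}P \\
&+ R^{-1}\Big\{A^{T} - A^{T} + A^{T}(AR^{-1}A^{T})^{-1}\left[\gamma B + (AR^{-1}A^{T})^{-1}\right]^{-1}\Big\}\gamma BH .
\end{aligned}$$

Now,

$$\lim_{\gamma \to \infty}\left(\tfrac{1}{\gamma}B^{-1} + AR^{-1}A^{T}\right)^{-1} = (AR^{-1}A^{T})^{-1}$$

and

$$\lim_{\gamma \to \infty}\left[\gamma B + (AR^{-1}A^{T})^{-1}\right]^{-1}\gamma B = \lim_{\gamma \to \infty}\left[I + \tfrac{1}{\gamma}B^{-1}(AR^{-1}A^{T})^{-1}\right]^{-1} = I .$$

So

$$\begin{aligned}
\lim_{\gamma \to \infty} W_{\mathrm{opt}} ={}& \left[I - R^{-1}A^{T}(AR^{-1}A^{T})^{-1}A\right]R^{-1}P \\
&+ R^{-1}A^{T}(AR^{-1}A^{T})^{-1}H ,
\end{aligned}$$

and since from (2.8) of Frost [9] (when rewritten in the notation
of this paper):

$$\begin{aligned}
W_{hc} ={}& \left[I - R^{-1}A^{T}(AR^{-1}A^{T})^{-1}A\right]R^{-1}P \\
&+ R^{-1}A^{T}(AR^{-1}A^{T})^{-1}H ,
\end{aligned}$$

the conclusion is

$$\lim_{\gamma \to \infty} W_{\mathrm{opt}} = W_{hc}$$

and the theorem is proved.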
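
The limit can also be observed numerically. The Python sketch below evaluates the soft-constraint optimum $(R + \gamma A^{T}BA)^{-1}(P + \gamma A^{T}BH)$ for increasing values of the scalar weighting and compares it with the hard-constraint optimum. It is a minimal illustration assuming two real weights and a single constraint, so that the constraint matrix reduces to a row `a`, the weighting to a scalar `b`, and the constraint value to a scalar `h`. All numerical values and function names are hypothetical, and the scalar weighting is called `gamma` in the code.

```python
def inv2(m):
    """Inverse of a 2x2 matrix given as a list of rows."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Product of a 2x2 matrix and a length-2 vector."""
    return [m[i][0] * v[0] + m[i][1] * v[1] for i in range(2)]

def soft_opt(R, P, a, b, h, gamma):
    """Soft-constraint optimum (R + gamma a^T b a)^{-1} (P + gamma a^T b h)."""
    M = [[R[i][j] + gamma * b * a[i] * a[j] for j in range(2)]
         for i in range(2)]
    rhs = [P[i] + gamma * b * a[i] * h for i in range(2)]
    return matvec(inv2(M), rhs)

def hard_opt(R, P, a, h):
    """Hard-constraint optimum
    [I - R^{-1} a^T (a R^{-1} a^T)^{-1} a] R^{-1} P + R^{-1} a^T (a R^{-1} a^T)^{-1} h,
    regrouped as R^{-1}P + R^{-1}a^T (a R^{-1} a^T)^{-1} (h - a R^{-1} P)."""
    Ri = inv2(R)
    RiP = matvec(Ri, P)                       # R^{-1} P
    Ria = matvec(Ri, a)                       # R^{-1} a^T
    s = a[0] * Ria[0] + a[1] * Ria[1]         # scalar a R^{-1} a^T
    corr = (h - (a[0] * RiP[0] + a[1] * RiP[1])) / s
    return [RiP[i] + Ria[i] * corr for i in range(2)]

# Hypothetical problem data.
R = [[2.0, 0.5], [0.5, 1.0]]   # nonsingular autocorrelation matrix
P = [1.0, 0.5]                 # cross-correlation vector
a = [1.0, 1.0]                 # single constraint row (full rank)
b = 1.0                        # scalar constraint weighting (nonsingular)
h = 1.0                        # desired constraint value

w_hc = hard_opt(R, P, a, h)
errors = []
for gamma in (1.0, 1e2, 1e4, 1e6):
    w = soft_opt(R, P, a, b, h, gamma)
    errors.append(max(abs(w[i] - w_hc[i]) for i in range(2)))
    print(gamma, w, errors[-1])
```

In this example the gap between the two solutions shrinks roughly in proportion to the reciprocal of the scalar weighting, which is consistent with the limit stated in the theorem.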
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